Topics in 
modern 
mathematics 2 




T. D. H. Baber 


Ph.D., M.Sc., B.Sc., Dip. Ed. 
Principal of Farnborough Technical College 




Pitman Publishing 


First published 1973 


The paperback edition of 

this book may not be lent, re-sold, hired out, or otherwise 
disposed of by way of trade in any form of binding 

or cover other than that in which it is published 

without the prior consent of the publishers 


SIR ISAAC PITMAN AND SONS LTD. 


Pitman House, Parker Street, Kingsway, London, WC2B 5PB 
P.O. Box 6038, Portal Street, Nairobi, Kenya 


SIR ISAAC PITMAN (AUST.) PTY. LTD. 
Pitman House, 158 Bouverie Street, Carlton, Victoria 3053, Australia 


PITMAN PUBLISHING COMPANY S.A. LTD. 
P.O. Box 11231, Johannesburg, S. Africa 


PITMAN PUBLISHING CORPORATION 
6 East 43rd Street, New York, N.Y. 10017, U.S.A. 


SIR ISAAC PITMAN (CANADA) LTD. 
495 Wellington St West, Toronto 135, Canada 


THE COPP CLARK PUBLISHING COMPANY 
517 Wellington St West, Toronto 135, Canada 


© T. D. H. Baber 1973 


Cased edition: ISBN 0 273 31680 X 
Paperback edition: ISBN 0 273 31682 6 


Made in Great Britain at the Pitman Press, Bath 
G3-(T.374/1361: 75) 




Preface 


Over recent years, the approach to the teaching of mathematics has changed 
considerably in recognition of the need to provide a more effective programme 
of mathematical education in schools and colleges. There is general agree- 
ment that curricula and teaching methods require modernization. Reforms 
have been proposed by various organizations with the aims of promoting an 
early understanding of the basic structure of mathematics and of eliminating 
outmoded traditional material. 

In various modern programmes, attention has been concentrated upon 
algebra, analysis and geometry treated in a more advanced and abstract way 
than formerly. The traditional approach, by which these subjects were 
compartmentalized, has been discarded and the present aim is to achieve a 
unified approach by which the relationships between these subjects are fully 
exploited and the underlying unity of mathematics is exhibited. It has been 
deemed desirable and practicable to introduce topics, formerly reserved for 
more advanced courses, at an earlier stage, e.g. the concepts and language 
of set theory are now widely recognized as providing an excellent medium 
for promoting the understanding and appreciation of a wide range of 
mathematical topics. 

This volume presents a number of modern topics suitable not only for 
students who are proceeding to the study of mathematics, science or 
engineering but also as a programme of general education on the reasonable 
assumption that modern mathematics can make an important contribution 
in the field of liberal education. Whilst the presentation of these topics is 
non-traditional, students, whose earlier mathematical education has pro- 
ceeded on traditional lines, should experience no handicap. 

Volume 1 is based upon a series of lectures given by the author a few
years ago to all undergraduates in the University of Malawi, irrespective of
their specializations, to introduce them to modern mathematical thinking 
and teaching as a form of liberal education. The author makes no claim for 
comprehensiveness though the basic topics of mathematics and certain 
interesting and important applications of the subject have been included in 
this volume. 

The two volumes, comprising this textbook, have not been written to 
cover any specific syllabuses. It is, however, claimed that they contain 
logically developed expositions of the more important topics of modern 
mathematics which have gained prominence in a wide range of examination 
syllabuses. 

Volume 1 deals with the structure of the real number system and the
representation of real numbers on the number line. The study of inequalities, ordered pairs,
relations and functions is developed in terms of set theory which is itself 
considered in an early chapter. A chapter on non-decimal arithmetic is 
followed by an account of digital computers and the elements of program- 
ming. This volume concludes with chapters on linear programming and an 
introduction to matrices and vectors. This volume is suitable for Vth and 
VIth formers in secondary schools, particularly for those who are studying 
modern mathematics syllabuses and also for students in Technical Colleges 
who are pursuing O.N.D./C. courses in Engineering and Science into which 
modern topics are being increasingly introduced. 

Volume 2 contains subject matter of a more advanced standard including 
vector differentiation and integration, probability theory, Boolean algebras, 
and group theory. Since these topics are included in H.N.D./C. and degree 
courses in Mathematics, Science and Engineering, this volume is suitable for 
students pursuing such courses in Technical Colleges, Polytechnics and 
Universities. 

In conclusion, I wish to thank Dr. Lee Peng Yee of the University of 
Malawi and Mr. R. W. Boxer for their helpful comments and suggestions. 


T. D. H. Baber 


Contents

Preface

1 Geometry of vectors

1 A vector as an equivalence class
2 Vectors and scalars
3 Vector algebra
4 Position vectors
5 The ratio theorem
6 Centroids
7 Rectangular unit vectors
8 Components of a vector
9 Equation of the straight line through a given point, parallel to a given vector
10 Equation of the straight line through two given points
11 Equations of the bisectors of the angles between two unit vectors localized at a given point
12 Equation of the plane through a given point, parallel to two given vectors
13 Equation of the plane through three given points
14 Linear dependence of vectors

2 Scalar and vector products

1 Scalar product of two vectors
2 Equation of a plane
3 Perpendicular distance of a point from a plane
4 The equations of planes bisecting the angles between two given planes
5 Vector area
6 Vector rotation
7 Vector product of two vectors

Products of three vectors
8 Scalar triple product
9 Equation of the plane through three non-collinear points
10 Equation of the plane through a given line and parallel to another line
11 The common perpendicular to two skew lines
12 Vector triple product

Products of four vectors
13 Scalar product of four vectors
14 Vector product of four vectors

3 Differentiation of vectors

1 Scalar and vector fields
2 Derivative of a vector
3 Curves in space
4 Derivatives of scalar and vector products
5 The elements of differential geometry
6 Equations of tangent, normal and binormal

4 Vector integration

1 Integral of a vector function of a scalar
2 Line integrals
3 Surface integrals
4 Normal to a surface
5 Volume integrals

Dynamics of a particle
6 Linear momentum; impulse; activity; kinetic energy
7 Motion of a particle under gravity
8 Moment of momentum (angular momentum)
9 Central forces
10 Planetary orbits
11 Motion under a central force directly proportional to distance

5 Probability theory

1 Introduction
2 Outcomes and events
3 Sample points and sample space
4 The probability of an event
5 Combinatorial formulae
6 Expectation or expected value
7 The expectation of functions of a random variable
8 Joint probability distribution functions
9 Probability generating functions
10 The probability generating function of the sum of independent random variables
11 Binomial and multinomial theorems
12 Independent trials with two outcomes
13 The probability generating function of the binomial distribution
14 Combinatorial generating functions
15 Compound probability
16 Conditional probability
17 Bayes’ theorem
18 Tree diagram
19 Independent trials
20 Markov chain processes
21 Linear transformation of vectors

6 Boolean algebras

1 Introduction
2 The laws of the algebra of sets
3 The laws of Boolean algebra
4 Binary Boolean algebra
5 Switching circuits
6 The algebra of the logic of statements

7 Residue classes

1 Congruences
2 Algebra of residue classes
3 Division of residues
4 Arithmetic prime modulo
5 Arithmetic nonprime modulo
Appendix: Euclid’s algorithm

8 Groups

1 Binary operations
2 The nature of groups
3 The order of a group
4 Notation
5 The inverse of a product
6 The index laws
7 Isomorphism
8 Permutation groups
9 The order of an element of a group
10 Cyclic group
11 Transformation groups
12 Subgroups
13 The centre of a group
14 Cayley’s theorem
15 Lagrange’s theorem

Index

1 Geometry of vectors 


1.1 A Vector as an Equivalence Class 


In Chapter 9 of volume 1, two- and three-dimensional vectors have been 
represented by (2 x 1) and (3 x 1) matrices respectively and various 
transformations in a plane have been studied in terms of such matrices. We 
now consider vectors from the geometrical point of view. 

Consider the set of all straight lines of various directions and lengths, i.e.
the set of directed line segments. The equivalence relation “has the same
direction and length” partitions this set of line segments into equivalence
classes, each class containing all those line segments which have the same
length and direction.

Each equivalence class now defines a vector. If the directed line AB is an
element* of a particular equivalence class, this class may be said to represent
the vector AB.

Two vectors AB and CD are equal, i.e. AB = CD if AB and CD have the 
same direction and equal lengths. 

We have seen that two vectors may be combined or “added” to form a
third vector by a parallelogram (or triangle) law. Let AB = A’B’, both
directed lines being drawn from the vector class AB. Similarly let BC =
B’C’; then we have†

AB + BC = AC   and   A’B’ + B’C’ = A’C’

* The bar indicates that $\overline{AB}$ represents a displacement of magnitude AB in the
direction from A towards B.

† The + sign here is used to denote vector addition.


Figure 1.1 


It is obvious that the triangles ABC, A’B’C’ are congruent and that
corresponding sides are parallel. Therefore

AC = A’C’

so that both are drawn from the vector class AC.

It follows that if any pair of elements be drawn one from each of two 
vector classes, their sum will always be an element of a third unique vector 
class. The third vector is defined to be the vector sum of the first two vectors. 

From the parallelogram in Fig. 1.1, we have 

AD + DC = AC 
i.e.
BC + AB = AC = AB + BC 
so that vector addition is commutative. 


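If, in Fig. 1.1, AB is represented by $\begin{pmatrix} a \\ b \end{pmatrix}$ and BC by $\begin{pmatrix} c \\ d \end{pmatrix}$, then AC will be
represented by $\begin{pmatrix} a + c \\ b + d \end{pmatrix}$.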


Thus an isomorphism exists between (2 x 1) matrices under addition and 
vectors under a parallelogram law of addition. 
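
The correspondence between directed line segments and column matrices can be tried out numerically. The following is a minimal sketch, not part of the original text: the function names `vector_of` and `add` and the sample coordinates are illustrative assumptions. Each directed segment is reduced to its column of coordinate differences, so that all members of one equivalence class give the same column.

```python
# Illustrative sketch: a directed segment is reduced to the column of
# coordinate differences, which is the same for every member of its class.

def vector_of(start, end):
    """Components of the directed segment from start to end."""
    return tuple(e - s for s, e in zip(start, end))

def add(u, v):
    """Add two vectors component by component, as for (2 x 1) matrices."""
    return tuple(x + y for x, y in zip(u, v))

# Triangle ABC and a translated copy A'B'C' (hypothetical coordinates).
A, B, C = (0, 0), (3, 1), (4, 5)
A2, B2, C2 = (10, -2), (13, -1), (14, 3)

AB, BC, AC = vector_of(A, B), vector_of(B, C), vector_of(A, C)

# AB and A'B' (and BC and B'C') are drawn from the same vector classes.
same_classes = AB == vector_of(A2, B2) and BC == vector_of(B2, C2)

print(same_classes)
print(add(AB, BC) == AC)                   # triangle law: AB + BC = AC
print(add(AB, BC) == vector_of(A2, C2))    # the sum lands in the class of A'C'
print(add(AB, BC) == add(BC, AB))          # addition is commutative
```

Whichever representatives are chosen, the sum falls in the same class, which is the content of the isomorphism stated above.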


1.2 Vectors and Scalars 


A vector is a quantity which has both magnitude and direction, e.g. displace- 
ment, velocity, acceleration, force. 


Figure 1.2

Notation
The vector OP (Fig. 1.2) will be represented by $\overline{OP}$ and its magnitude by
$|\overline{OP}|$. It is often convenient, however, to represent the vector $\overline{OP}$ by $\mathbf{A}$ and
its magnitude by $|\mathbf{A}|$ or simply $A$.

A unit vector is a vector of unit magnitude. Thus $\mathbf{A}/A$ is a unit vector
having the direction of $\mathbf{A}$ ($A \neq 0$).

A scalar is a quantity which has magnitude but no direction, e.g. any real 
number, mass, length, time, temperature. Scalars will be denoted by ordinary 
letters like m, n. 


1.3 Vector Algebra 


The familiar operations of addition, subtraction and multiplication in 
ordinary algebra may, by suitable definition, be extended to develop an 
algebra of vectors. 


DEFINITION 1.1 
Two vectors A and B are equal, i.e. A = B, if they have the same direction 
and magnitude. 


DEFINITION 1.2 

The zero, null or neutral vector is a vector which is the neutral element for 
vector addition. It will clearly be a vector of zero magnitude and will be 
denoted by 0, so that

A + 0 = A


DEFINITION 1.3 
A vector having a direction opposite to that of A but having the same magni- 
tude will be denoted by —A. 




ADDITION 


Vectors are added by a parallelogram law and

A + B = B + A   and   A + (−A) = 0

The addition of vectors is associative, that is

A + (B + C) = (A + B) + C

In Fig. 1.3,

OP + PQ = OQ = (A + B)   and   OQ + QR = OR = D

Figure 1.3

Therefore

(A + B) + C = D

Similarly,

PQ + QR = PR = (B + C)   and   OP + PR = OR = D

Therefore

A + (B + C) = D

and the result follows.




SUBTRACTION 


The difference of two vectors A and B is that vector which when added to B 
gives A (Fig. 1.4). This is equivalent to defining A — B as the sum of A and 
—B, i.e. 

A − B = A + (−B)

If A = B then A − B = 0

Figure 1.4


MULTIPLICATION BY A SCALAR 
The product of a vector A by a scalar m is a vector mA whose magnitude is
|m| times that of A. It has the same direction as A if m > 0 and has the
opposite direction to A if m < 0. If m = 0, mA is a null vector.


LAWS OF VECTOR ALGEBRA 


(1) A + B = B + A                  Commutative Law of Addition
(2) A + (B + C) = (A + B) + C      Associative Law of Addition
(3) mA = Am                        Commutative Law of Multiplication
(4) m(nA) = (mn)A                  Associative Law of Multiplication
(5) (m + n)A = mA + nA             Distributive Law
(6) m(A + B) = mA + mB             Distributive Law
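
The laws involving scalars can be checked on particular vectors by working with components. This is only a spot check under assumed data, not a proof; the helper names `add`, `scale` and `sub` and the sample values are illustrative and do not come from the text. Exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def add(u, v):
    """Component-wise sum of two vectors."""
    return tuple(x + y for x, y in zip(u, v))

def scale(m, u):
    """Product of the scalar m and the vector u."""
    return tuple(m * x for x in u)

def sub(u, v):
    """A - B, defined as in the text as A + (-B)."""
    return add(u, scale(-1, v))

A, B = (1, -2, 3), (4, 0, -1)      # illustrative vectors
m, n = Fraction(3, 2), -2          # illustrative scalars

checks = {
    "(4) m(nA) = (mn)A": scale(m, scale(n, A)) == scale(m * n, A),
    "(5) (m + n)A = mA + nA": scale(m + n, A) == add(scale(m, A), scale(n, A)),
    "(6) m(A + B) = mA + mB": scale(m, add(A, B)) == add(scale(m, A), scale(m, B)),
    "A - A = 0": sub(A, A) == (0, 0, 0),
    "(A - B) + B = A": add(sub(A, B), B) == A,
}
for law, holds in checks.items():
    print(law, holds)
```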


1.4 Position Vectors 


For a given origin O, the position vectors of points A, B, C,..., with 
respect to O are OA, OB, OC,.... It will often be convenient to use the 
abbreviations a, b, c,. . . , respectively, for their position vectors. 


1.5 The Ratio Theorem 




We now find the vector equation of the straight line through two points A 
and B whose position vectors with respect to O are a and b respectively 


(Fig. 1.5). 


Figure 1.5 


Let R be a point in AB such that OR = r. Then

AR = r − a   and   AB = b − a

Since AR and AB are collinear, AR = tAB, and

r − a = t(b − a)

Thus the vector equation of AB with respect to O is

r = a(1 − t) + bt

This equation may be written

$$\mathbf{r} = \lambda\mathbf{a} + \mu\mathbf{b} \tag{1.1}$$

where λ + μ = 1. Also

$$\frac{AR}{RB} = \frac{t\,AB}{(1 - t)\,AB} = \frac{t}{1 - t} = \frac{\mu}{\lambda}$$

Thus eqn (1.1) gives the position vector of a point R which divides AB in the
ratio μ:λ. (This ratio will be positive or negative according to whether R
divides AB internally or externally.)

The symmetric form of this theorem may be obtained by noting that, since
AR and RB are collinear, there are scalars m, n such that

mAR = nRB

i.e.

m(r − a) = n(b − r)

$$\mathbf{r} = \frac{m\mathbf{a} + n\mathbf{b}}{m + n} \tag{1.2}$$

This result may also be obtained by writing λ = m/(m + n), μ = n/(m + n) in
eqn (1.1).

Equation (1.2) may be written

(m + n)r − ma − nb = 0

in which the sum of the coefficients of r, a and b is zero. If A, R, B are distinct
collinear points, none of these coefficients is zero. Thus if A, R, B are distinct
collinear points, numbers l, m, n exist, different from zero, such that

lr + ma + nb = 0,   l + m + n = 0      (1.3)

and conversely.
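
The symmetric form of the ratio theorem translates directly into a small routine. The sketch below is illustrative only: the function name `divide` and the sample points are assumptions, and integer coordinates and integer m, n are assumed so that exact fractions can be used. It returns the point R with mAR = nRB, so that AR/RB = n/m, and then checks the collinearity condition (1.3) with l = −(m + n).

```python
from fractions import Fraction

def divide(a, b, m, n):
    """Return r = (m*a + n*b)/(m + n), the point R with m*AR = n*RB."""
    if m + n == 0:
        raise ValueError("m + n must be non-zero")
    return tuple(Fraction(m * x + n * y, m + n) for x, y in zip(a, b))

a, b = (0, 0), (6, 3)            # illustrative end points A and B

r_in = divide(a, b, 2, 1)        # AR/RB = 1/2: R lies between A and B
r_ex = divide(a, b, 2, -1)       # AR/RB = -1/2: R lies outside AB

# Condition (1.3): l*r + m*a + n*b = 0 with l + m + n = 0.
m, n = 2, 1
l = -(m + n)
residual = tuple(l * r + m * x + n * y for r, x, y in zip(r_in, a, b))

print(r_in, r_ex, residual)
```

A negative value of n/m places R outside the segment, in agreement with the remark on internal and external division.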


1.6 Centroids 


The centroid of two particles of masses $m_1$, $m_2$ placed at A and B respectively
is a point R on AB such that $AR/RB = m_2/m_1$.

Figure 1.6

In applying the principles of mechanics, the two particles may usually be
replaced by a single particle of mass $(m_1 + m_2)$ at R.
Using the symmetrical form of the ratio theorem, we have

$$(m_1 + m_2)\mathbf{r} = m_1\mathbf{a} + m_2\mathbf{b}$$

If there is a third particle of mass $m_3$ at C (Fig. 1.7), the centroid G of the
three particles may be found as the centroid of mass $(m_1 + m_2)$ at R and
mass $m_3$ at C and will be given by

$$(m_1 + m_2 + m_3)\mathbf{g} = (m_1 + m_2)\mathbf{r} + m_3\mathbf{c} = m_1\mathbf{a} + m_2\mathbf{b} + m_3\mathbf{c}$$

Figure 1.7

G is clearly in the plane defined by the positions of the three particles but the
origin O may be anywhere.
By continued application of the above procedure, it easily follows that the
centroid G of n particles of masses $m_1, m_2, m_3, \ldots, m_n$ at points $P_1, P_2,
P_3, \ldots, P_n$ respectively is given by

$$\left(\sum_{r=1}^{n} m_r\right)\overline{OG} = \sum_{r=1}^{n} m_r\,\overline{OP_r} \tag{1.4}$$

It is often convenient to refer to G as the centroid of points $P_1, P_2, \ldots, P_n$
with associated numbers $m_1, m_2, \ldots, m_n$ respectively. Negative masses or
numbers cause no difficulty: for example, the centroid G of points A and B
with associated numbers $m_1$ and $-m_2$ respectively ($m_1 > m_2 > 0$) is given by

$$(m_1 - m_2)\mathbf{r} = m_1\mathbf{a} - m_2\mathbf{b}$$

and G divides AB externally such that $AG/GB = m_2/m_1$ [cf. eqn (1.2)].

The centroid G is determined uniquely by eqn (1.4) irrespective of the
choice of O. If this were not the case, let G’ be the centroid as determined
from another origin O’. Then

$$\left(\sum_{r=1}^{n} m_r\right)\overline{O'G'} = \sum_{r=1}^{n} m_r\,\overline{O'P_r}$$

Therefore

$$\left(\sum_{r=1}^{n} m_r\right)\left(\overline{OG} - \overline{O'G'}\right) = \sum_{r=1}^{n} m_r\left(\overline{OP_r} - \overline{O'P_r}\right) = \left(\sum_{r=1}^{n} m_r\right)\overline{OO'}$$

Since $\sum_{r=1}^{n} m_r \neq 0$, $\overline{OG} - \overline{O'G'} = \overline{OO'}$, then

$$\overline{OG} = \overline{O'G'} + \overline{OO'} = \overline{OG'}$$

so that G’ coincides with G.


The ratio theorem, eqn (1.2), and its extension to centroids, eqn (1.4), are 
most useful in proving many theorems in geometry. 
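
Equation (1.4) and the independence of the centroid from the choice of origin can be illustrated numerically. The sketch below is not from the text: the function names `centroid` and `relative_to` and the second origin are assumptions, and integer coordinates are assumed so that exact fractions can be used. The four points and associated numbers are those of one of the exercises at the end of this chapter, whose stated answer is (0, 2, 2).

```python
from fractions import Fraction

def centroid(points, numbers):
    """Position vector of G from (sum m_r) OG = sum m_r OP_r."""
    total = sum(numbers)
    if total == 0:
        raise ValueError("the associated numbers must not sum to zero")
    dim = len(points[0])
    return tuple(Fraction(sum(m * p[i] for p, m in zip(points, numbers)), total)
                 for i in range(dim))

def relative_to(origin, p):
    """Position vector of the point p referred to a different origin."""
    return tuple(x - o for x, o in zip(p, origin))

points = [(-2, 2, -1), (2, -1, 3), (-2, 4, 1), (1, 2, 3)]
numbers = [1, 2, 3, 4]
g = centroid(points, numbers)

# Repeat the calculation from a hypothetical second origin O'.
new_origin = (5, -7, 2)
g_prime = centroid([relative_to(new_origin, p) for p in points], numbers)

print(g)
print(g_prime == relative_to(new_origin, g))   # same point G from either origin
```

The guard on a zero total mirrors the condition that the sum of the associated numbers must not vanish.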


EXAMPLE 1.1 
Show that the medians of a triangle are concurrent. 


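Solution Let G be the centroid of equal masses placed at the vertices A, B,
C of the triangle and let D be the mid-point of BC (Fig. 1.8).

Figure 1.8

$$\mathbf{g} = \tfrac{1}{3}(\mathbf{a} + \mathbf{b} + \mathbf{c}) = \tfrac{1}{3}\mathbf{a} + \tfrac{2}{3}\cdot\tfrac{1}{2}(\mathbf{b} + \mathbf{c}) = \tfrac{1}{3}\mathbf{a} + \tfrac{2}{3}\mathbf{d}$$

By the ratio theorem, G lies on the median AD such that AG/GD = 2/1.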

Similarly the centroid G lies at the points of trisection of the other two 
medians. Hence the medians are concurrent at G which is called the centroid 
of the triangle. 




EXAMPLE 1.2 
Show that the altitudes of a triangle are concurrent. 


Figure 1.9 


Solution Let the circumcentre of the triangle ABC in Fig. 1.9 be chosen as
origin O and let h = a + b + c. Since |a| = |b| = |c|, the parallelogram
OCDB, having the vectors b and c as a pair of sides, is a rhombus.
Therefore, OD is perpendicular to BC.
Therefore, vector (b + c) is perpendicular to vector (b − c) and therefore
vector (h − a) is perpendicular to vector (b − c), and AH is perpendicular
to BC.
Similarly, H is on the other two altitudes of the triangle ABC, so that the
three altitudes are concurrent at H which is called the orthocentre.
Also since $\mathbf{g} = \tfrac{1}{3}(\mathbf{a} + \mathbf{b} + \mathbf{c}) = \tfrac{1}{3}\mathbf{h}$, G lies on OH such that $OG = \tfrac{1}{3}OH$.
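
The construction h = a + b + c can be tested on a particular triangle whose circumcentre is the origin. The following sketch rests on assumptions that are not in the text: the three vertices are illustrative points on a circle of radius 5, and perpendicularity is tested by the vanishing of the sum of products of corresponding components, a Cartesian criterion taken here as given.

```python
from fractions import Fraction

def add(*vectors):
    """Component-wise sum of any number of vectors."""
    return tuple(sum(parts) for parts in zip(*vectors))

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def dot(u, v):
    """Sum of products of corresponding components (zero for perpendicular vectors)."""
    return sum(x * y for x, y in zip(u, v))

# Vertices equidistant from the origin, so O is the circumcentre.
a, b, c = (5, 0), (-3, 4), (0, -5)
equidistant = dot(a, a) == dot(b, b) == dot(c, c)

h = add(a, b, c)                          # candidate orthocentre H
altitude_checks = (
    dot(sub(h, a), sub(b, c)),            # AH against BC
    dot(sub(h, b), sub(c, a)),            # BH against CA
    dot(sub(h, c), sub(a, b)),            # CH against AB
)
g = tuple(Fraction(x, 3) for x in h)      # centroid, one third of the way along OH

print(equidistant, h, altitude_checks, g)
```

All three checks vanish, so H lies on each altitude of this triangle, and G is the point one third of the way from O to H.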


EXAMPLE 1.3 
Show that the internal bisectors of a triangle ABC are concurrent at I, the 
centroid of masses sin A, sin B, sin C at A, B, C respectively. 


Figure 1.10

Solution Take A as origin. Then, if AC = b and AB = c, from Fig. 1.10,

$$(\sin A + \sin B + \sin C)\,\overline{AI} = \mathbf{c}\sin B + \mathbf{b}\sin C = \frac{p}{c}\,\mathbf{c} + \frac{p}{b}\,\mathbf{b} = p\left(\frac{\mathbf{c}}{c} + \frac{\mathbf{b}}{b}\right)$$

where p is the altitude of the triangle through A.

Since $\mathbf{c}/c$ and $\mathbf{b}/b$ are unit vectors in the directions AB and AC respectively,
it follows that $(\mathbf{c}/c) + (\mathbf{b}/b)$ is a vector in the direction of the internal bisector
of the angle A. Thus AI has the direction of this bisector, so that the
centroid I lies on this bisector, and similarly lies on the other two internal
bisectors.

Therefore, the internal bisectors of a triangle are concurrent at I, the 
incentre. 


EXAMPLE 1.4 

Prove that the straight lines joining the vertices of a tetrahedron to the 
centroids of the opposite faces (the medians of the tetrahedron) are con- 
current. 


Figure 1.11 


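Solution Let G be the centroid of unit masses placed at the vertices A, B,
C, D of a tetrahedron (Fig. 1.11). Let P, P’ be the mid-points of AB, CD
respectively. Also let Q, Q’ be those of BC, AD, and R, R’ those of AC, BD
respectively. Let $G_1$ be the centroid of triangle BCD. Then

$$\mathbf{g} = \tfrac{1}{4}(\mathbf{a} + \mathbf{b} + \mathbf{c} + \mathbf{d}) = \tfrac{1}{2}\left[\tfrac{1}{2}(\mathbf{a} + \mathbf{b}) + \tfrac{1}{2}(\mathbf{c} + \mathbf{d})\right] = \tfrac{1}{2}(\mathbf{p} + \mathbf{p}')$$

Similarly $\mathbf{g} = \tfrac{1}{2}(\mathbf{q} + \mathbf{q}') = \tfrac{1}{2}(\mathbf{r} + \mathbf{r}')$. It follows that G is the point of
concurrence of the lines PP’, QQ’, RR’ which join the mid-points of opposite
edges of the tetrahedron. Also

$$\mathbf{g} = \tfrac{1}{4}\mathbf{a} + \tfrac{3}{4}\cdot\tfrac{1}{3}(\mathbf{b} + \mathbf{c} + \mathbf{d}) = \tfrac{1}{4}\mathbf{a} + \tfrac{3}{4}\mathbf{g}_1$$

Therefore G lies on $AG_1$ and divides $AG_1$ so that $AG : GG_1 = 3 : 1$. Thus
G is also the point of concurrence of the four medians of the tetrahedron
so that seven lines meet at G.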


EXAMPLE 1.5 

A transversal cuts the sides AB, BC, CA of the triangle ABC in the points 
D, E, F respectively. Show that the product of the ratios in which D, E, F 
divide these sides is —1 (Menelaus’ Theorem). 


Figure 1.12 


Solution Let E divide BC in the ratio l:m (Fig. 1.12) and let F divide CA
in the ratio n:l so that

(l + m)e = lc + mb   and   (n + l)f = na + lc

By subtracting, eliminate c so that

$$\frac{(l + m)\mathbf{e} - (n + l)\mathbf{f}}{m - n} = \frac{m\mathbf{b} - n\mathbf{a}}{m - n}$$

By the ratio theorem, eqn (1.1), each of these fractions is equal to the position
vector d since D lies in both FE and AB.

But clearly, from (mb − na)/(m − n) = d, D divides AB in the ratio
m:−n. Therefore

$$\text{Product of the three ratios} = \frac{m}{-n} \times \frac{l}{m} \times \frac{n}{l} = -1$$
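
The argument can be followed through on a numerical case. In the sketch below the triangle, the values of l, m, n and the helper names `combine`, `collinear` and `ratio` are all illustrative assumptions. The points E and F are built from the two ratio equations, D is taken from the formula d = (mb − na)/(m − n), and the code then confirms that D lies on both FE and AB and that the three signed ratios multiply to −1.

```python
from fractions import Fraction

def combine(pairs, denom):
    """Return (sum of k*v over the (k, v) pairs) / denom for plane vectors."""
    return tuple(Fraction(sum(k * v[i] for k, v in pairs), denom) for i in range(2))

def collinear(p, q, r):
    """True when the plane points p, q, r lie on one straight line."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]) == 0

def ratio(p, q, x):
    """Signed ratio PX/XQ for a point X on the line PQ (negative if external)."""
    i = 0 if q[0] != p[0] else 1
    s = Fraction(x[i] - p[i]) / (q[i] - p[i])
    return s / (1 - s)

a, b, c = (0, 0), (4, 0), (0, 4)           # illustrative triangle ABC
l, m, n = 1, 3, 2                          # illustrative ratios

e = combine([(l, c), (m, b)], l + m)       # (l + m)e = lc + mb
f = combine([(n, a), (l, c)], n + l)       # (n + l)f = na + lc
d = combine([(m, b), (-n, a)], m - n)      # d = (mb - na)/(m - n)

on_both_lines = collinear(d, e, f) and collinear(d, a, b)
product = ratio(a, b, d) * ratio(b, c, e) * ratio(c, a, f)

print(d, on_both_lines, product)
```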


EXERCISES 1.1 
1. Prove that the line joining the mid-points of two sides of a triangle is parallel to 
the third and has one half of its length. 


2. If O is any point inside triangle ABC and P, Q, R are the mid-points of its sides, 
show that 
OA + OB + OC = OP + OQ + OR 


Show that the result also holds if O lies outside triangle ABC. 


3. The mid-points of the consecutive sides of any quadrilateral (skew or otherwise) 
are joined. Show that the resulting quadrilateral is a parallelogram. 


4. Show that the orthocentre of a triangle ABC coincides with the centroid of 
masses tan A, tan B, tan C at A, B, C respectively. 


5. Show that the circumcentre of triangle ABC coincides with the centroid of 
masses sin 2A, sin 2B, sin 2C at A, B, C respectively. 


6. If G, G’ are respectively the centroids of triangles ABC, A’B’C’ show that
3GG’ = AA’ + BB’ + CC’

7. P and Q are the mid-points of the sides AB, BC respectively of the parallelogram
ABCD. Show that DP and DQ trisect the line AC. Prove also that AC passes
through a point of trisection of DP and DQ.

8. Let D be the point on the side BC of triangle ABC such that BD/DC = n/m
and let R divide AD such that AR/RD = (m + n)/l. Show that

r = (la + mb + nc)/(l + m + n)

If D, E, F are points on BC, CA, AB respectively such that AD, BE, CF are
concurrent at R, deduce that

$$\frac{BD}{DC} \times \frac{CE}{EA} \times \frac{AF}{FB} = 1 \quad \text{(Ceva’s theorem)}$$


1.7 Rectangular Unit Vectors 


Using a right-handed system of axes (Fig. 1.13), unit vectors along Ox, Oy, 
Oz will be denoted by i, j, k respectively. 




Figure 1.13 


1.8 Components of a Vector 


Let a vector A be localized at O (Fig. 1.14) and let the coordinates of the
terminal point of A with respect to rectangular axes through O be $(a_1, a_2, a_3)$.

Figure 1.14

The vectors $a_1\mathbf{i}$, $a_2\mathbf{j}$, $a_3\mathbf{k}$ are called the rectangular component vectors of A
with respect to the axes and $a_1$, $a_2$, $a_3$ are called its rectangular components
with respect to the axes.

$$\mathbf{A} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$$

and the magnitude of A is $A = \sqrt{a_1^2 + a_2^2 + a_3^2}$.
If r is the position vector of the point (x, y, z) with respect to O,

$$\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k} \quad \text{and} \quad r = \sqrt{x^2 + y^2 + z^2}$$


Figure 1.15

Let the vector r make angles α, β, γ with the positive directions of the
coordinate axes Ox, Oy, Oz respectively (Fig. 1.15).
From the right-angled triangle OAR,

cos α = OA/OR = x/r

Similarly cos β = y/r and cos γ = z/r. Therefore

$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = (x^2 + y^2 + z^2)/r^2 = 1$$

cos α, cos β, cos γ are called the direction cosines of the vector OR.
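
The magnitude and direction cosines follow at once from the components. A minimal sketch, with the function names `magnitude` and `direction_cosines` and the sample vector 2i − 6j + 3k chosen for illustration (its magnitude is 7):

```python
import math

def magnitude(v):
    """Length of a vector from its rectangular components."""
    return math.sqrt(sum(x * x for x in v))

def direction_cosines(v):
    """Return (cos alpha, cos beta, cos gamma) = (x/r, y/r, z/r)."""
    r = magnitude(v)
    if r == 0:
        raise ValueError("the null vector has no direction")
    return tuple(x / r for x in v)

v = (2, -6, 3)                            # illustrative vector
cosines = direction_cosines(v)
sum_of_squares = sum(c * c for c in cosines)

print(magnitude(v), cosines, sum_of_squares)
```

Up to rounding, the squares of the three cosines add to 1, as the identity above requires.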
Let a and b be two non-collinear vectors localized at O (Fig. 1.16). Let r be
the vector OR in the plane determined by a and b. Through R draw BR, AR
parallel to a, b respectively; then

r = OA + OB = xa + yb

where x, y are suitable scalars.

Figure 1.16

xa and yb are said to be the components of r in the directions of a and b
respectively and a and b are called base vectors in the plane.
Similarly a vector r in three-dimensional space may be expressed in terms
of three non-coplanar base vectors a, b, c (Fig. 1.17).

r = OA + OB + OC = xa + yb + zc

where x, y, z are suitable scalars.

Figure 1.17

If xa + yb + zc = 0, then x = y = z = 0, for suppose x ≠ 0, then

$$\mathbf{a} = -\frac{y}{x}\,\mathbf{b} - \frac{z}{x}\,\mathbf{c}$$

so that a is a vector in the plane of b and c, contrary to hypothesis. Therefore
x = 0 and similarly y = z = 0.
It follows that if $x_1\mathbf{a} + y_1\mathbf{b} + z_1\mathbf{c} = x_2\mathbf{a} + y_2\mathbf{b} + z_2\mathbf{c}$, so that

$$(x_1 - x_2)\mathbf{a} + (y_1 - y_2)\mathbf{b} + (z_1 - z_2)\mathbf{c} = \mathbf{0}$$

then $x_1 - x_2 = 0$ and $x_1 = x_2$. Similarly $y_1 = y_2$, $z_1 = z_2$.
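
Because the scalars x, y, z are unique, they can be computed once the rectangular components of r and of the base vectors are known. The sketch below uses Cramer's rule with 3 x 3 determinants, which is an assumption of this illustration rather than a method used in the text; the names `det3` and `components` are likewise illustrative, and integer components are assumed. The data are those of one of the exercises at the end of this chapter, whose stated answer is d = 3a + 5b − 2c.

```python
from fractions import Fraction

def det3(u, v, w):
    """Determinant of the 3 x 3 array whose rows are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def components(r, a, b, c):
    """Return (x, y, z) with r = x*a + y*b + z*c, by Cramer's rule."""
    d = det3(a, b, c)
    if d == 0:
        raise ValueError("the base vectors are coplanar")
    return (Fraction(det3(r, b, c), d),
            Fraction(det3(a, r, c), d),
            Fraction(det3(a, b, r), d))

a, b, c = (2, -4, -3), (-1, 2, 2), (3, -2, 1)
d = (-5, 2, -1)

x, y, z = components(d, a, b, c)
rebuilt = tuple(x * p + y * q + z * s for p, q, s in zip(a, b, c))

print((x, y, z), rebuilt == d)
```

The routine refuses coplanar base vectors, for which no unique representation exists.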


1.9 Equation of the Straight Line through a Given Point, 
Parallel to a Given Vector 


Let the required line pass through point A and be parallel to vector b. Let R
be a point on the line (Fig. 1.18). Then

AR = tb
r = OA + AR = a + tb

Figure 1.18

If R and A are respectively the points (x, y, z) and $(a_1, a_2, a_3)$ and b has the
components $(b_1, b_2, b_3)$ then

$$x\mathbf{i} + y\mathbf{j} + z\mathbf{k} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) + t(b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k})$$

Therefore

$$x = a_1 + tb_1, \quad y = a_2 + tb_2, \quad z = a_3 + tb_3$$

Thus the Cartesian equation of the line is

$$\frac{x - a_1}{b_1} = \frac{y - a_2}{b_2} = \frac{z - a_3}{b_3} \; (= t)$$


1.10 Equation of the Straight Line through Two Given Points 
A and B are the two points (Fig. 1.19).

AB = b − a

Figure 1.19

Therefore

AR = t(b − a)
r = OA + AR = a + t(b − a)
r = (1 − t)a + tb

$$x\mathbf{i} + y\mathbf{j} + z\mathbf{k} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) + t[(b_1 - a_1)\mathbf{i} + (b_2 - a_2)\mathbf{j} + (b_3 - a_3)\mathbf{k}]$$

so that the Cartesian equation of the line is

$$\frac{x - a_1}{b_1 - a_1} = \frac{y - a_2}{b_2 - a_2} = \frac{z - a_3}{b_3 - a_3}$$
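
The parametric form r = (1 − t)a + tb is convenient for computation because it still applies when one component of b − a is zero, a case in which the corresponding ratio in the Cartesian form has a zero denominator. The following sketch is illustrative: the function names `point_on_line` and `parameter_of` and the sample points are assumptions, and the two points A and B are assumed to be distinct.

```python
from fractions import Fraction

def point_on_line(a, b, t):
    """Return r = (1 - t)a + tb, a point of the line through A and B."""
    return tuple((1 - t) * p + t * q for p, q in zip(a, b))

def parameter_of(a, b, r):
    """Return t with r = (1 - t)a + tb, or None if r is not on the line AB."""
    t = None
    for p, q, x in zip(a, b, r):
        if q == p:
            if x != p:              # this coordinate is constant along the line
                return None
        else:
            ti = Fraction(x - p) / (q - p)
            if t is None:
                t = ti
            elif ti != t:           # coordinates disagree about t
                return None
    return t

a, b = (1, 2, 3), (3, 2, -1)        # illustrative points; the y-coordinate is constant

mid = point_on_line(a, b, Fraction(1, 2))
print(mid)
print(parameter_of(a, b, mid))
print(parameter_of(a, b, (2, 5, 1)))
```

Returning `None` for a point off the line keeps the membership test and the recovery of t in one routine.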


1.11 Equations of the Bisectors of the Angles between Two 
Unit Vectors Localized at a Given Point 


Let R be a point on the internal bisector between the unit vectors a and b
(Fig. 1.20). Let B’R and A’R be drawn parallel to a and b respectively.

|OA’| = |A’R| = |OB’| so that OA’ = ta and A’R = tb

But

r = OA’ + A’R

and therefore

r = t(a + b)

Figure 1.20

Similarly the equation of the external bisector CR’ is r = t(a − b). If a and b
are not unit vectors, the equations of the two bisectors are

$$\mathbf{r} = t\left(\frac{\mathbf{a}}{a} \pm \frac{\mathbf{b}}{b}\right)$$

EXAMPLE 1.6 


AR, the internal bisector of the angle BAC of triangle ABC, meets BC in R. 
Show that BR/RC = c/b. 


Figure 1.21

Solution Taking A as origin, with AB = c and AC = b, the equation of the
internal bisector of angle BAC is

$$\mathbf{r} = t\left(\frac{\mathbf{c}}{c} + \frac{\mathbf{b}}{b}\right)$$

If the scalar t is given the particular value bc/(b + c), then

$$\mathbf{r} = \frac{b\mathbf{c} + c\mathbf{b}}{b + c}$$

is a point on AR. By the ratio theorem, it must be a point on BC and is
therefore the point R. Thus R is the centroid of B and C with associated
numbers b and c respectively. Thus BR/RC = c/b.

Similarly, since the external bisector of angle BAC is

$$\mathbf{r} = t\left(\frac{\mathbf{c}}{c} - \frac{\mathbf{b}}{b}\right)$$

$(b\mathbf{c} - c\mathbf{b})/(b - c)$ is a point R’ on it which divides CB externally such that
BR’/R’C = c/b.


EXAMPLE 1.7 
Show that the internal bisectors of the angles of a triangle are concurrent. 


Solution The centroid of the points A, B, C with associated numbers a, b, c
respectively is a point I on AR (Fig. 1.21) such that AI/IR = (b + c)/a.

Similarly the centroid I must be on the internal bisectors of the angles 
ABC, ACB. 

Therefore the three internal bisectors are concurrent at I. 

Similarly, it may be shown that the internal bisector of the angle BAC and 
the external bisectors of the other two angles are concurrent at the centroid 
of the points A, B, C with associated numbers a, —b, —c respectively. 
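
The results of Examples 1.6 and 1.7 can be checked on a right-angled triangle with sides 3, 4, 5. The sketch below is illustrative only: the coordinates and the function name `weighted_centroid` are assumptions, and floating-point arithmetic is used, so comparisons are made to within rounding. Here a, b, c denote the lengths of BC, CA, AB.

```python
import math

def weighted_centroid(points, numbers):
    """Centroid of points with associated numbers, from eqn (1.4)."""
    total = sum(numbers)
    dim = len(points[0])
    return tuple(sum(m * p[i] for p, m in zip(points, numbers)) / total
                 for i in range(dim))

A, B, C = (0.0, 0.0), (4.0, 0.0), (0.0, 3.0)      # illustrative triangle
a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)

R = weighted_centroid([B, C], [b, c])             # foot of the bisector from A
I = weighted_centroid([A, B, C], [a, b, c])       # point of concurrence

ratio_BR_RC = math.dist(B, R) / math.dist(R, C)   # expected to equal c/b
ratio_AI_IR = math.dist(A, I) / math.dist(I, R)   # expected to equal (b + c)/a

print(R, I, ratio_BR_RC, ratio_AI_IR)
```

For this triangle the point I comes out one unit from each of the two shorter sides, which is consistent with its being the incentre.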


1.12 Equation of the Plane through a Given Point, Parallel 
to Two Given Vectors 


Let A be the given point. Let R be a point in the plane (Fig. 1.22), and b and
c the vectors. Since AR is parallel to the plane of b and c,

AR = sb + tc   and   OR = OA + AR

Therefore

r = a + sb + tc

As R moves in the plane, the scalars s and t take various values.


Figure 1.22 


1.13 Equation of the Plane through Three Given Points 


The required plane passes through A, B, C (Fig. 1.23) and is parallel to the
vectors AB = b − a and AC = c − a so that the required equation is

r = a + s(b − a) + t(c − a) = (1 − s − t)a + sb + tc

(It is assumed that the three vectors a, b, c are not coplanar.)
Thus four points R, A, B, C are coplanar if

(1 − s − t)a + sb + tc − r = 0

Figure 1.23

In this equation, the sum of the coefficients of the four vectors is zero (cf.
the condition, eqn (1.3), for the collinearity of three points).
This result may be written in the equivalent form

la + mb + nc + pr = 0      (1.5)

where l + m + n + p = 0, and l, m, n, p are not all zero.
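
The equation of the plane through three points gives a direct way of generating points of the plane, and a determinant gives a test for coplanarity. The sketch below is illustrative: the names `plane_point`, `coplanar` and `det3` and the sample points are assumptions, and the determinant test (the vanishing of the 3 x 3 determinant formed from AB, AC and AR) is a standard criterion assumed here rather than derived in this section.

```python
from fractions import Fraction

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def det3(u, v, w):
    """Determinant of the 3 x 3 array whose rows are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def plane_point(a, b, c, s, t):
    """Return r = (1 - s - t)a + sb + tc, a point of the plane ABC."""
    return tuple((1 - s - t) * x + s * y + t * z for x, y, z in zip(a, b, c))

def coplanar(r, a, b, c):
    """True when R lies in the plane through A, B, C."""
    return det3(sub(b, a), sub(c, a), sub(r, a)) == 0

a, b, c = (1, 0, 0), (0, 1, 0), (0, 0, 1)        # illustrative points A, B, C

r = plane_point(a, b, c, Fraction(1, 4), Fraction(1, 2))
print(r, coplanar(r, a, b, c), coplanar((1, 1, 1), a, b, c))
```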


1.14 Linear Dependence of Vectors 


(i) Let a, b, c be three distinct* coplanar vectors (Fig. 1.24). Let OC cut AB
in R. Then by eqn (1.3), scalars l, m, n exist, each different from zero, such
that

la + mb + nr = 0,   l + m + n = 0

Figure 1.24

But c = tr where t is a scalar and therefore

αa + βb + γc = 0,   α, β, γ ≠ 0

In the special case when OC is parallel to AB,

OC = λAB so that c = λ(b − a) where λ is a scalar

Thus when three distinct vectors a, b, c are coplanar, they satisfy a linear
relationship of the form

αa + βb + γc = 0      (1.6)

and are said to be linearly dependent.

* The word “distinct” implies that no two vectors have the same direction.

(ii) Let a, b, c, d be four distinct vectors in space, of which no three lie in
the same plane.
Let OD meet the plane containing A, B, C in R (Fig. 1.23). Then by
eqn (1.5),

la + mb + nc + pr = 0,   l + m + n + p = 0

But d = tr and therefore

αa + βb + γc + δd = 0      (1.7)

α, β, γ, δ are not all zero but some may be zero.
In the particular case, when OD is parallel to the plane ABC,

d = λ(b − a) + μ(c − a)

which is of the same form as eqn (1.7).

When four distinct vectors satisfy an equation of this form, they are said 
to be linearly dependent. Whilst three vectors in three dimensional space may 
be linearly independent, four vectors must be linearly dependent. 
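
For three vectors given by rectangular components, linear dependence can be decided by a determinant: the vectors are linearly dependent exactly when the 3 x 3 determinant of their components vanishes. That criterion is set as an exercise at the end of this chapter and is assumed here without proof. The function names are illustrative, and the two sets of vectors are taken from the exercises that follow.

```python
def det3(u, v, w):
    """Determinant of the 3 x 3 array whose rows are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def linearly_dependent(a, b, c):
    """True when a, b, c satisfy a relation of the form (1.6)."""
    return det3(a, b, c) == 0

# a = i - 4k, b = 4i + 3j - k, c = 2i + j - 3k
first = ((1, 0, -4), (4, 3, -1), (2, 1, -3))
# a = 3i + 2j - k, b = i - 3j + k, c = 2i + j - 3k
second = ((3, 2, -1), (1, -3, 1), (2, 1, -3))

print(linearly_dependent(*first), det3(*first))
print(linearly_dependent(*second), det3(*second))
```

The first set satisfies 2a + b = 3c, so its determinant vanishes; the second set has a non-zero determinant and is linearly independent.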


EXERCISES 1.2 
1. A and B are given vectors. Show that 


(i) |A +B] <|A| + |B] = Gi) [A — B] > |A| — [BI 

2. If P and Q are the points (x;, y,, 21) and (x2, yz, Z2) respectively, find the 
magnitude of the vector PQ. If P is the point (2,3, —1) and Q the point 
(4, —3, 2) show that PQ is the vector 2i — 6j + 3k having a magnitude 7. 

3. Show that equations of the straight line through the points (0, —2, 3) and 
(1, —2, 1) are 2x + z = 3; y =z. Show that this straight line meets the 
plane determined by the origin and the points (2, 4, 1) and (4, 0, 2) in the point 
(6/5, 2, 3/5). 

4. Show that the centroid of A (—2,2, —1), B (2, —1,3), C (—2,4,1), D 
(1, 2, 3) with associated numbers 1, 2, 3 and 4 respectively is (0, 2, 2). 

5. Show that the vectors A = —i + 3j + 4k, B = 3i+ j — 2k, C = 4i — 2j — 
6k are coplanar and show that the lengths of the medians of the triangle ABC 
are $,/386, $,/14, ./74. 

6. With distances measured in nautical miles and speeds in knots, three ships are 
observed from a coastguard station at half-hour intervals. They have the 
following distance (s) and velocity (v) vectors: 


$s, = 2i+ 6) and vy, = Si+ 4jat 1200 
Ss = 61+ 9j and y, = 4i + 3jat 1230 
Ss, = 1li+ 6j and vy, = 2i+ 7j at 1300 


24 


10. 


12. 


13; 


geometry of vectors 


Prove that if the ships continue with the same velocities, two of them will 
collide, and find the time of collision. If at that instant the third ship changes 
course and then proceeds directly to the scene of collision at its original speed, 
find at what time it will arrive ( U.L.A-level), 

[Time of collision 14 20; ay 29 minutes after the collision] 


7. If the position vectors of points P₁, P₂, ..., Pₙ with respect to an origin O are
r₁, r₂, ..., rₙ respectively and scalars k₁, k₂, ..., kₙ exist such that

k₁r₁ + k₂r₂ + ... + kₙrₙ = 0

then show that this result will be independent of the origin if, and only if,
k₁ + k₂ + ... + kₙ = 0.


8. The position vectors of three points A, B, C are respectively

a = 2i − 4j − 3k    b = −i + 2j + 2k    c = 3i − 2j + k
Express the vector d = −5i + 2j − k as a linear function of a, b, c.
What point on d is in the plane ABC?
[3a + 5b − 2c; (−5/6, 1/3, −1/6)]


9. a, b, c are non-coplanar base vectors. Show that the vectors r₁ = 3a + b − c,
r₂ = −5a + 2b − 3c and r₃ = 36a + b + 2c are linearly dependent.

[r₃ = 7r₁ − 3r₂]

10. Show that the vectors

(a) a = i − 4k, b = 4i + 3j − k, c = 2i + j − 3k are linearly dependent.
[2a + b = 3c]

(b) a = 3i + 2j − k, b = i − 3j + k, c = 2i + j − 3k are linearly independent.


11. Show that

| x₁  y₁  z₁ |
| x₂  y₂  z₂ | ≠ 0
| x₃  y₃  z₃ |

is a necessary and sufficient condition that the vectors a = x₁i + y₁j + z₁k,
b = x₂i + y₂j + z₂k, c = x₃i + y₃j + z₃k shall be linearly independent.


12. Show that the equation of the plane passing through the points A, B, C may be
written

r = (la + mb + nc)/(l + m + n)


and verify that this equation is independent of the origin. 


13. Show that the mid-points of the six edges of a cube which do not meet a given
diagonal are coplanar.


14. The triangles ABC, A′B′C′ are such that AA′, BB′, CC′ are concurrent. AB and
A′B′ meet at X and the other pairs of corresponding sides of the triangles meet
at Y and Z. Show that X, Y, Z are collinear (Desargues' Theorem).


2 Scalar and vector products 


2.1 Scalar Product of Two Vectors 
The work done by a force F when its point of application is displaced from 
O to R (Fig. 2.1), i.e. is given a displacement r, is given by 

Work done = F cos θ × r

Figure 2.1

i.e. the product of the magnitudes of the two vectors F and r and the cosine
of the angle between their directions.
This scalar quantity is said to be the scalar product of the vectors F and r. 


DEFINITION 2.1 
The scalar product of two vectors A and B is AB cos θ, where θ is the angle
between the directions of A and B, and is denoted by A.B (Fig. 2.2).

Thus A.B = AB cos θ = B.A (0 ≤ θ ≤ π).

Figure 2.2


If A and B are perpendicular vectors, then A.B = 0. When A = B, we have
A.A = A², the square of A.
The square of a unit vector is therefore 1.

i² = j² = k² = 1, i.e. i.i = j.j = k.k = 1 (2.1)
Since i, j, k are mutually perpendicular,
i.j = j.k = k.i = 0 (2.2)
If m is a scalar,
m(A.B) = mAB cos θ = (mA).B = A.(mB)


Let b be a unit vector in the direction OB and let PQ be parallel to OB (Fig. 
2.3). The projection of the vector A on OB is 


ON = PQ = A cos θ = A.b


Let r be the vector from the origin O of rectangular coordinates to the point 
(x, y, z). Then 


x = the projection of r on Ox = r.i 
Similarly y = r.j and z = r.k. Therefore 


r = xi + yj + zk = (r.i)i + (r.j)j + (r.k)k


Figure 2.3


Note that the scalar products here are the scalar coefficients of the vectors 
i, j, k. 


DISTRIBUTIVE LAW 
With reference to Fig. 2.4, let C be a vector in the direction PQR and let c 
be a unit vector in the same direction. The projection of A + B in the 
direction PQR is equal to the sum of the projections of A and B in this 
direction. Therefore 

(A + B).c = A.c + B.c
On multiplying by C, we have

(A + B).Cc = A.Cc + B.Cc

i.e.
(A + B).C = A.C + B.C (2.3)


so that the distributive law holds for scalar products. 
By repeated applications of this result, it follows that the scalar product 
of two sums of vectors may be expanded as in ordinary algebra, e.g. 


(A + B).(C + D) = (A + B).C + (A + B).D 
= A.C + B.C + A.D + B.D 


Let A and B be expressed in terms of their rectangular components, then 


A.B = (A₁i + A₂j + A₃k).(B₁i + B₂j + B₃k)
    = A₁B₁ + A₂B₂ + A₃B₃ (2.4)
Thus A = √(A₁² + A₂² + A₃²).

Since A.B = AB cos θ, it follows that the angle θ between the directions
of A and B is given by

cos θ = (A₁B₁ + A₂B₂ + A₃B₃) / [√(A₁² + A₂² + A₃²) √(B₁² + B₂² + B₃²)] (2.5)
      = l₁l₂ + m₁m₂ + n₁n₂

where (l₁, m₁, n₁), (l₂, m₂, n₂) are the direction cosines of the two vectors, i.e.
l₁ = cos θ₁ = A₁/|A| = A₁/√(A₁² + A₂² + A₃²), etc.


EXAMPLE 2.1 
(A) Find the angle between the vectors A = 3i + j — k and B = 2i — 3j + 5k 


cos θ = A.B/(AB) = [3.2 + 1.(−3) + (−1).5]/(√11 √38) = −2/√418

cos θ = −0.0978    θ = 95°37′


(B) What is the work done by a force F = 4i + 3j − k in moving a particle
along the vector r = 3i + j − 2k?


Work done = F.r = 4.3 + 3.1 + (−1)(−2) = 17
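
Parts (A) and (B) can be reproduced in a few lines. This is a minimal sketch using plain tuples for vectors; the helper names `dot`, `norm` and `angle_deg` are hypothetical and not part of the text.

```python
from math import acos, degrees, sqrt


def dot(a, b):
    """Scalar product from rectangular components, as in eqn (2.4)."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Magnitude of a vector."""
    return sqrt(dot(a, a))


def angle_deg(a, b):
    """Angle between two vectors in degrees, from cos(theta) = A.B/(AB)."""
    return degrees(acos(dot(a, b) / (norm(a) * norm(b))))


A, B = (3, 1, -1), (2, -3, 5)
theta = angle_deg(A, B)                  # part (A)

F, r = (4, 3, -1), (3, 1, -2)
work = dot(F, r)                         # part (B)
```

The angle is returned in decimal degrees; converting the fractional part to minutes gives the form quoted in the example.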




(C) Prove the formulae: 
(i) b = c cos A + a cos C
(ii) a² = b² + c² − 2bc cos A for a triangle ABC (Fig. 2.5)


Figure 2.5


(i) b = c + a, therefore b.b = c.b + a.b

b² = cb cos A + ab cos C, i.e. b = c cos A + a cos C
(ii) a = b − c, therefore a.a = (b − c).(b − c)

a² = b² + c² − 2b.c = b² + c² − 2bc cos A


EXAMPLE 2.2 

Define the projection of any point in three-dimensional Euclidean space onto 
a plane through the origin. If u is the position vector of the point and p(u) 
denotes its projection on the plane, show that your definition can be expressed 
vectorially in the form 


p(u) = u − (u.n)n


where n is a unit vector perpendicular to the given plane.
Prove that p² = p.


Two such projections p₁, p₂ onto planes Π₁ and Π₂ through the origin,
p₁(u) = u − (u.n₁)n₁,    p₂(u) = u − (u.n₂)n₂
are given. Prove that p₁p₂ = p₂p₁ if and only if Π₁ and Π₂ coincide or are
perpendicular. (Oxford and Cambridge: G.C.E. A-Level, S.M.P.)


Solution Let R be a point whose position vector is u. Let N be the point in 
which a perpendicular from R to the plane meets it (Fig. 2.6). Let n be a unit 
vector normal to the plane in the direction NR. Projection of OR on the 
plane is 


ON = OR — NR = u — (u.n)n 


Figure 2.6


i.e. p(u) = u − (u.n)n
Therefore


p²(u) = p{p(u)} = {u − (u.n)n} − [{u − (u.n)n}.n]n
      = u − (u.n)n − [(u.n) − (u.n)(n.n)]n
      = u − (u.n)n


i.e. p²(u) = p(u). (It is otherwise obvious that the projection of ON on the
plane is ON.) Now,


p₁p₂(u) = p₁{p₂(u)} = u − (u.n₂)n₂ − [{u − (u.n₂)n₂}.n₁]n₁
        = u − (u.n₂)n₂ − (u.n₁)n₁ + (u.n₂)(n₂.n₁)n₁


Similarly p₂p₁(u) = u − (u.n₁)n₁ − (u.n₂)n₂ + (u.n₁)(n₁.n₂)n₂


Hence, p₁p₂ = p₂p₁ provided that (u.n₂)(n₂.n₁)n₁ = (u.n₁)(n₁.n₂)n₂
i.e. if n₁ = n₂, when the planes coincide,
or if n₁.n₂ = 0, when the planes are perpendicular.
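
The two results of this example can be spot-checked numerically. The sketch below assumes arbitrary sample data (the point `u` and the unit normals `n1`, `n2`, `n3` are illustrative choices, not taken from the text) and uses exact fractions so that equality tests are meaningful.

```python
from fractions import Fraction


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(u, n):
    """p(u) = u - (u.n)n, where n is a unit normal to a plane through O."""
    k = dot(u, n)
    return tuple(ui - k * ni for ui, ni in zip(u, n))


u = (2, -1, 5)                               # arbitrary sample point
n1 = (0, 0, 1)
n2 = (1, 0, 0)                               # perpendicular to n1
n3 = (Fraction(3, 5), 0, Fraction(4, 5))     # oblique to n1

# p applied twice gives the same point as p applied once.
idempotent = project(project(u, n3), n3) == project(u, n3)
# Projections onto perpendicular planes commute ...
commute_perpendicular = (project(project(u, n2), n1)
                         == project(project(u, n1), n2))
# ... but projections onto obliquely inclined planes need not.
commute_oblique = (project(project(u, n3), n1)
                   == project(project(u, n1), n3))
```

A single sample point does not prove the general statement, of course; it only illustrates the behaviour established in the solution.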


EXAMPLE 2.3 
Prove that the perpendiculars drawn from the vertices of a triangle to the 
opposite sides are concurrent. 


Solution Let the perpendiculars from A and B to the opposite sides meet in 
H (Fig. 2.7). 
Since AH is perpendicular to BC, we have (h — a).(b — c) = 0. 
h.b — a.b — h.c + a.c = 0 (i) 
Similarly (h − b).(c − a) = 0


h.c − b.c − h.a + a.b = 0 (ii)


Figure 2.7

On adding equations (i) and (ii), we have 
(h — c).(b — a) = 0 


so that HC is perpendicular to AB which proves the theorem. 


EXAMPLE 2.4 

Two pairs of opposite edges of a tetrahedron are perpendicular. Show that 
the third pair are also perpendicular. Show also that the sum of the squares 
of the lengths of the edges is equal to four times the sum of the squares of the 
lengths of the lines joining the mid-points of the opposite edges. 


Solution Let OA be perpendicular to BC (Fig. 2.8). Then 
a.(c − b) = 0, i.e. a.c = a.b (i)

Let OB be perpendicular to AC. Then
b.(c − a) = 0, i.e. b.c = a.b (ii)


Figure 2.8


Therefore
a.c = b.c, i.e. c.(b − a) = 0 (iii)


Therefore OC is perpendicular to AB.
Let M and N be the mid-points of AC and OB respectively. Then


MN = ON − OM = ½b − ½(a + c) = ½{(b − c) − a}


Sum of the squares of the joins of the mid-points of opposite edges is


¼[{(b − c) − a}² + {(c − a) − b}² + {(a − b) − c}²]
= ¼[(b − c)² + (c − a)² + (a − b)² + a² + b² + c²
    − 2{a.(b − c) + b.(c − a) + c.(a − b)}]
= ¼[a² + b² + c² + (b − c)² + (c − a)² + (a − b)²]
    [by (i), (ii) and (iii)]
= ¼[sum of the squares of the lengths of the edges]


EXERCISES 2.1 
1. Calculate the work done by a force of 30 newtons whose line of action has
direction cosines (⅓, ⅔, ⅔) in a displacement from the point (1, 3, 5) to the point
(7, 9, 2) where distances are measured in metres. [120 joules]


2. What is the projection of the vector A = 2i − 3j + k on the vector B =
4i − 7j + 4k? [11/3]


3. If a, b, c are coplanar vectors and a is not parallel to b, show that

      | c.a  a.b |       | a.a  c.a |
      | c.b  b.b | a  +  | a.b  c.b | b
c = ---------------------------------------
               | a.a  a.b |
               | a.b  b.b |


4. Prove that the diagonals of a rhombus intersect at right angles. 
5. Show that the perpendicular bisectors of the sides of a triangle are concurrent.
6. For the tetrahedron in Fig. 2.8, show that

OA² + BC² = OB² + CA² = OC² + AB²


7. Show that, in a regular tetrahedron, the perpendiculars from the vertices to the 
opposite faces meet these faces in their centroids. Show that the angle between
two faces is cos⁻¹(1/3) and that the angle between a face and an edge not in that
face is cos⁻¹(1/√3).


8. A tetrahedron OABC has a vertex O at the origin and adjacent edges OA, OB,
OC are represented by the vectors a, b, c respectively. If G is the centroid of the
face ABC, prove that

3OG = a + b + c

If the angles BOC, COA, AOB are α, β, γ respectively and if the lengths of
OA, OB, OC are a, b, c respectively, prove that

9OG² = a² + b² + c² + 2bc cos α + 2ac cos β + 2ab cos γ
Find also an expression for the cosine of the angle between AB and OC (U.L. A-level).
[(a cos β − b cos α)(a² + b² − 2ab cos γ)^(−1/2)]
9. Define the scalar product of two three-dimensional Euclidean vectors u, v. 
Deduce an expression for the angle between the vectors 
u = u₁i + u₂j + u₃k and v = v₁i + v₂j + v₃k

in terms of u₁, u₂, u₃, v₁, v₂, v₃. A regular tetrahedron has vertices O, A, B, C
where O is the origin, and A, B, C have position vectors with respect to O
given by

OA = −i + j    OB = ai + bj    OC = pi + qj + rk

Find numerical values of a, b, p, q, r given that a > 0 and r > 0. (Oxford and
Cambridge: G.C.E. A-Level, S.M.P.)

[a = ½(√3 − 1); b = ½(√3 + 1); p = −½ + √3/6; q = ½ + √3/6; r = ⅔√3]


2.2 Equation of a Plane 


Let the plane pass through a point A and let n be a unit vector perpendicular 
to the plane and having the direction from the origin O towards the plane 
(Fig. 2.9). Let R be any point on the plane so that AR = r — a is perpen- 
dicular to n. 

Therefore (r — a).n = 0. 

Since n has the direction of ON, it follows that a.n = ON = p, a positive
number, which is the length of the perpendicular from the origin to the plane.

Therefore the equation of the plane takes the form


r.n = p (normal form) (2.6)
or


lx + my + nz = p (2.7)


in Cartesian coordinates where l, m, n are the direction cosines of the normal
ON.


Figure 2.9


The angle of inclination of two planes is the angle θ between their normals.
Therefore


cos θ = n₁.n₂ = l₁l₂ + m₁m₂ + n₁n₂ (2.8)


2.3 Perpendicular Distance of a Point from a Plane 
Let NM (Fig. 2.10) be the trace of the plane r.n = p. Suppose we require the 
perpendicular distance R’S = d of the point R’ from the plane NM. 

Let N’M’ be a plane through R’ parallel to NM. The equation of N’M’ is 


r.n = p′ where p′ = ON′


Figure 2.10


Therefore r′.n = p′ and
d = R′S = N′N = p − p′ = p − r′.n (2.9)


This quantity will be positive for points R′ on the origin side of the plane
r.n = p and negative for points on the other side.
In Cartesian coordinates


d = p − (lx′ + my′ + nz′) (2.10)


2.4 The Equations of Planes Bisecting the Angles between 
Two Given Planes 


Let the two given planes be r.n₁ = p₁ and r.n₂ = p₂.

Any point on a bisector will be equidistant from these two planes. The
perpendicular distances of any point on the bisector of the angle containing
the origin will both be positive. Therefore the equation of this bisecting plane
will be (eqn (2.9))


p₁ − r.n₁ = p₂ − r.n₂
i.e.


r.(n₁ − n₂) = p₁ − p₂ (2.11)


For any point on the other bisecting plane, the perpendicular distances will
have opposite signs, so that its equation will be


r.(n₁ + n₂) = p₁ + p₂ (2.12)


2.5 Vector Area 


A plane area may be represented by a vector. Let A be the magnitude of the 
area. Let n be a unit vector normal to this area. To distinguish between the 
two possible directions of n, the following convention is adopted. Let n have 
the direction of advance of a right-handed screw which is rotated in the 
direction PQR in which the boundary of the area is described (Fig. 2.11). 
This means that n will have the direction of Oz in a right-handed system of
coordinate axes when the boundary of the area is described in a direction
from Ox towards Oy in the first quadrant.
The area A may then be represented by the vector area An = A.


Figure 2.11


2.6 Vector Rotation 


A rotation may be represented by a vector whose direction is that of the axis
of rotation and whose magnitude is that of the angle of rotation, say θ. The
direction of this vector is that of a unit vector n as determined by the
convention of Section 2.5 where the direction of the rotation is from Ox towards
Oy in the first quadrant (Fig. 2.12).

Thus the rotation θ may be represented by the vector rotation θn = θ.


Figure 2.12


EXAMPLE 2.5 


Find the equation of the plane through the point A (2, —1, 3) perpendicular 
to the line OB where B is the point (3, —2, —6). What is the length of the 


perpendicular from the origin to this plane? What is the perpendicular 
distance from the point (4, 3, —2) to the plane? 


Solution 
a = 2i − j + 3k    b = 3i − 2j − 6k
Equation of the plane is (r − a).b = 0, i.e. r.b = a.b. Substituting
(xi + yj + zk).(3i − 2j − 6k) = (2i − j + 3k).(3i − 2j − 6k)
3x − 2y − 6z = 6 + 2 − 18 = −10
3x − 2y − 6z + 10 = 0


This equation will be converted to the normal form (see eqn (2.7)) upon
division by √{3² + (−2)² + (−6)²}, i.e. by 7.

Normal form is −(3/7)x + (2/7)y + (6/7)z = 10/7. Thus the perpendicular distance
from the origin to the plane is p = 10/7.

Alternatively: n = (−3i + 2j + 6k)/7


p = a.n = [(2i − j + 3k).(−3i + 2j + 6k)]/7 = 10/7
Distance d of the point R′ [r′ = 4i + 3j − 2k] from the plane is given by


d = p − r′.n = 10/7 − [(4i + 3j − 2k).(−3i + 2j + 6k)]/7 = 10/7 + 18/7 = 4
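
The steps of this example (reduce to normal form, read off p, then apply d = p − r′.n) can be sketched in code. The function names `normal_form` and `signed_distance` are hypothetical, and the sketch assumes for simplicity that the normal vector has an integer length so that exact fractions can be used throughout.

```python
from fractions import Fraction
from math import isqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normal_form(b, a):
    """Unit normal n and distance p >= 0 for the plane through a perpendicular to b.

    Illustration only: assumes |b| is a non-zero integer.
    """
    length = isqrt(dot(b, b))
    if length == 0 or length * length != dot(b, b):
        raise ValueError("this sketch needs an integer-length normal")
    n = tuple(Fraction(x, length) for x in b)
    p = dot(a, n)
    if p < 0:                      # make n point from the origin towards the plane
        n = tuple(-x for x in n)
        p = -p
    return n, p


def signed_distance(n, p, point):
    """d = p - r'.n: positive on the origin side of the plane r.n = p."""
    return p - dot(point, n)


n, p = normal_form((3, -2, -6), (2, -1, 3))
d = signed_distance(n, p, (4, 3, -2))
```

The sign convention follows eqn (2.9): a positive result means the point lies on the same side of the plane as the origin.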


EXAMPLE 2.6 
Find the dihedral angle between the two planes 
2x + 6y − 3z = 10    7x + 4y − 4z = 8
Find also the equation of the plane which bisects the dihedral angle
containing the origin.
Solution The equations of the planes will be converted to the normal form
on division by 7 and by 9 respectively.
n₁ = [2i + 6j − 3k]/7    n₂ = [7i + 4j − 4k]/9
n₁ − n₂ = (1/63)(−31i + 26j + k)
Also p₁ = 10/7 and p₂ = 8/9. Therefore the equation of the plane bisecting
the dihedral angle containing the origin is given by eqn (2.11):
r.(n₁ − n₂) = p₁ − p₂
i.e. (xi + yj + zk).(−31i + 26j + k)/63 = (10/7) − (8/9)
−31x + 26y + z = 34


The dihedral angle θ is given by eqn (2.8):


cos θ = n₁.n₂ = (2/7)(7/9) + (6/7)(4/9) + (−3/7)(−4/9) = 50/63
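
The arithmetic of this example is easy to confirm with exact fractions. The sketch below assumes the two planes read as 2x + 6y − 3z = 10 and 7x + 4y − 4z = 8; the helper `unit_normal_form` is a hypothetical name, and the lengths 7 and 9 of the normals are supplied by hand.

```python
from fractions import Fraction


def unit_normal_form(coeffs, rhs, length):
    """Divide a plane equation through by the (known) length of its normal."""
    n = tuple(Fraction(c, length) for c in coeffs)
    return n, Fraction(rhs, length)


n1, p1 = unit_normal_form((2, 6, -3), 10, 7)
n2, p2 = unit_normal_form((7, 4, -4), 8, 9)

# Eqn (2.8): cosine of the dihedral angle.
cos_theta = sum(x * y for x, y in zip(n1, n2))

# Eqn (2.11): bisector of the angle containing the origin, scaled by 63
# to clear the denominators.
bisector = tuple(63 * (x - y) for x, y in zip(n1, n2))
bisector_rhs = 63 * (p1 - p2)
```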


EXERCISES 2.2 
1. Find the perpendicular distance of the point (—1, 2, 3) from the plane 
2x —9y + 6z=12 [14/11] 




2. Find the equation of a plane which passes through the point A (3, —1, 2) and 
is perpendicular to AB where B is the point (—5, 3, —1). What are the per- 
pendicular distances from the origin and from the point (2, —3, 5) to the plane? 
[8x − 4y + 3z = 34; 34/√89; (−)9/√89]

3. Find the equation of the plane passing through the origin and through the line of
intersection of the planes


r.a = λ    r.b = μ
[r.(μa − λb) = 0]
4. Show that the equation of a sphere centre C and radius a is


r² − 2r.c + c² − a² = 0


where c is the position vector of C.
5. Find the equation of a sphere on AB as diameter.


[(r − a).(r − b) = 0]


6. If c is the position vector of the point (x₀, y₀, z₀), obtain the equations of
(i) a sphere, centre (x₀, y₀, z₀), of radius a,
(ii) a plane perpendicular to c passing through (x₀, y₀, z₀),
(iii) a sphere, centre (½x₀, ½y₀, ½z₀) passing through the origin.
[(i) |r − c| = a, (ii) (r − c).c = 0, (iii) (r − c).r = 0]


2.7 Vector Product of Two Vectors 


Let F be a force localized in the line NR and let r be the position vector of 
the point R on this line with respect to a point O (Fig. 2.13). 

The moment of F about O has a magnitude F × ON = Fr sin θ.

Let n be a unit vector normal to the plane of r and F such that r, F and n
form a right-handed system (Fig. 2.14).


Figure 2.13


Figure 2.14


It is convenient to represent the moment or torque of the force F about O
by the vector


G = (Fr sin θ)n
which is called the vector product of r and F and is denoted by r × F, i.e.


G = r × F = (Fr sin θ)n


Figure 2.15


DEFINITION 2.2

The vector product of two vectors A and B is a vector C = A × B (Fig. 2.15).
The magnitude of C is defined to be AB sin θ where θ is the angle between
the directions of the vectors A and B. The direction of C is perpendicular to
the plane of A and B such that A, B and C form a right-handed system.


Therefore

C = A × B = AB sin θ n (0 ≤ θ ≤ π) (2.13)
where n is a unit vector in the direction of A x B. 
It follows that B x A has the opposite direction to A x B but has the same 
magnitude. Therefore 


B × A = −A × B


Thus the Commutative Law does not hold for vector products.
If A and B are parallel vectors, sin θ = 0. Therefore


A × B = 0


In particular, A × A = 0 for all vectors A.
It follows that for the unit vectors i, j, k,


i × i = j × j = k × k = 0 (2.14)
whilst

i × j = −j × i = k

j × k = −k × j = i (2.15)

k × i = −i × k = j
For any scalar m,


mA × B = mAB sin θ n = A × mB = m(A × B)


DISTRIBUTIVE LAW 


Let A be a vector which is perpendicular to each of two vectors B₁ and C₁.
To prove that


A × (B₁ + C₁) = A × B₁ + A × C₁


The vector A × B₁ lies in the plane defined by B₁ and C₁, is perpendicular
to A and B₁, and has a magnitude AB₁ (Fig. 2.16).

The vectors A × C₁ and A × (B₁ + C₁) also lie in the plane defined by
B₁ and C₁, are respectively perpendicular to C₁ and (B₁ + C₁), and have
magnitudes AC₁ and A|B₁ + C₁| respectively.


Figure 2.16


It follows that if the parallelogram whose sides are B₁ and C₁ and whose
diagonal is (B₁ + C₁) is rotated through +90° about A and is then magnified
A times, the resulting parallelogram has sides representing A × B₁ and
A × C₁ and a diagonal representing A × (B₁ + C₁). Therefore


A × (B₁ + C₁) = A × B₁ + A × C₁ (2.16)


Now, to generalize, let A, B, C be non-coplanar vectors, and let B₁, B₂ be the
components of B, perpendicular and parallel respectively to A (Fig. 2.17).
The magnitude of A × B₁ = AB sin θ = magnitude of A × B.
Also, the directions of A × B₁ and A × B are the same. Therefore


A × B₁ = A × B


Figure 2.17


Figure 2.18


Similarly A × C₁ = A × C.


Also B + C = (B₁ + C₁) + (B₂ + C₂). Therefore
A × (B + C) = A × (B₁ + C₁)


Now by eqn (2.16) it follows that
A × (B + C) = A × B + A × C (2.17)


Expressing the vectors A and B in terms of their rectangular components and 
applying the distributive law, we have 


A × B = (A₁i + A₂j + A₃k) × (B₁i + B₂j + B₃k)
      = (A₂B₃ − A₃B₂)i + (A₃B₁ − A₁B₃)j + (A₁B₂ − A₂B₁)k (2.18)


by virtue of eqns (2.14) and (2.15).
In determinantal form, this becomes


        | i   j   k  |
A × B = | A₁  A₂  A₃ | (2.19)
        | B₁  B₂  B₃ |


Since A × B = AB sin θ n, it follows that


sin² θ = [(A₂B₃ − A₃B₂)² + (A₃B₁ − A₁B₃)² + (A₁B₂ − A₂B₁)²]
         / [(A₁² + A₂² + A₃²)(B₁² + B₂² + B₃²)] (2.20)

       = (m₁n₂ − m₂n₁)² + (n₁l₂ − n₂l₁)² + (l₁m₂ − l₂m₁)² (2.21)


where (l₁, m₁, n₁) and (l₂, m₂, n₂) are the direction cosines of A and B
respectively.
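
The component formula for the vector product translates directly into code. The following is a minimal sketch with hypothetical helper names `cross` and `dot`; the sample vectors are illustrative values only.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product from rectangular components, as in eqn (2.18)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


A, B = (2, -1, 1), (1, 0, 3)        # sample vectors
C = cross(A, B)

# C is perpendicular to both factors, and reversing the order changes the sign.
perpendicular = dot(C, A) == 0 and dot(C, B) == 0
anticommutes = cross(B, A) == tuple(-x for x in C)

# |A x B|^2 = A^2 B^2 sin^2(theta) = A^2 B^2 - (A.B)^2
magnitude_check = dot(C, C) == dot(A, A) * dot(B, B) - dot(A, B) ** 2
```

The last check is eqn (2.20) rearranged with sin² θ = 1 − cos² θ, which avoids evaluating the angle itself.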


EXAMPLE 2.7 


(A) Find the vector of magnitude 10 which is perpendicular to each of the 
vectors A = 2i —j + k and B = i-+ 3k. 


Solution The required vector will have the direction of 


            | i   j   k |
C = A × B = | 2  −1   1 | = −3i − 5j + k
            | 1   0   3 |


A unit vector having the direction of C is (1/√35)(−3i − 5j + k).


Required vector is ±(10/√35)(3i + 5j − k) = ±(2√35/7)(3i + 5j − k).


(B) Show that the area of a parallelogram having sides A and B is |A × B|.
Solution Let h be the height of the parallelogram (Fig. 2.19).
Area = |A| h = |A| |B| sin θ = |A × B|


Thus the area of a triangle with sides A and B is ½|A × B|.


Figure 2.19


(C) Prove the sine law for a plane triangle. 


Solution Let a, b, ¢ be vectors representing the sides of the triangle ABC 
(Fig. 2.20). Therefore 


a + b + c = 0


Figure 2.20


Take the vector product of each term by a,


a × a + a × b + a × c = 0
i.e.
a × b = c × a


Similarly by taking the vector product by b, a × b = b × c. Therefore


|a × b| = |b × c| = |c × a|
ab sin C = bc sin A = ca sin B

i.e.
sin A/a = sin B/b = sin C/c


(This result also follows immediately from (B).) 


(D) Show that the vector sum of the vector areas of the faces of a tetrahedron 
is zero. 


Solution By (B), the area of the face OAB = ½|a × b|.

Thus vector area of the face OAB = ½(a × b) which has the direction of the
outward normal to this face.

Similarly the vector areas of the other faces in the directions of their
outward normals are


½(b × c)    ½(c × a)    ½{(c − a) × (b − a)}
Therefore, the sum of vector areas is
½[a × b + b × c + c × a + (c − a) × (b − a)]
= ½[a × b + b × c + c × a + c × b − a × b − c × a]
= 0


Figure 2.21


(E) Find the moment of a force F = Xi + Yj + Zk, which passes through
a point R (x, y, z), about the origin O (Fig. 2.22).

The position vector of R is r = xi + yj + zk.

The moment vector G of F about O is given by


            | i  j  k |
G = r × F = | x  y  z |
            | X  Y  Z |


  = (yZ − zY)i + (zX − xZ)j + (xY − yX)k


Figure 2.22


In particular, the moment of F about Ox is (yZ − zY) = G.i which is a
scalar quantity.

Similarly zX − xZ = G.j and (xY − yX) = G.k are the moments of F
about Oy and Oz respectively.


(F) A rigid body rotates with angular velocity ω about an axis. Find the
vector velocity v of the point R of the body whose position vector with
respect to a point O on the axis is r.


Solution Let ON be the axis of rotation and let RN be perpendicular to
ON (Fig. 2.23). The angular velocity of the body may be represented by a
vector ωn where n is a unit vector whose direction is related to the direction
of rotation in accordance with the convention of Section 2.5.


Figure 2.23


R is moving in a circle of radius RN with angular velocity ω. Thus the
magnitude of the velocity v of R is


ωRN = ωr sin θ


The direction of v is perpendicular to both n and r and is directed into the
paper. Therefore


v = ω(n × r) = ω × r
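
The relation v = ω × r can be illustrated numerically. In the sketch below the angular velocity (2 rad/s about Oz) and the point (3, 4, 5) are made-up sample values, and `cross` and `dot` are hypothetical helper names.

```python
from math import sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


omega = (0, 0, 2)        # sample: 2 rad/s about the z-axis
r = (3, 4, 5)            # sample point of the body

v = cross(omega, r)      # velocity of the point
speed = sqrt(dot(v, v))

# The point is 5 units from the z-axis, so the speed should be 2 x 5.
distance_from_axis = sqrt(r[0] ** 2 + r[1] ** 2)
```

The velocity has no component along the axis or along r, in agreement with the statement that v is perpendicular to both n and r.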


EXERCISES 2.3 
1. If A = i − 2j + k and B = 2i + j − 3k, find (a) A × B, (b) B × A, and show
that (B − A) × (B + A) = −2(A × B).
[(a) 5(i + j + k), (b) −5(i + j + k)]


2. If A = −i + 2j − k and B = 2i − j + 3k, find
(a) A × B, (b) |A × B|, (c) the unit vector parallel to A × B, (d) (A + 2B) ×
(A − B).


[(a) 5i + j − 3k, (b) √35, (c) (1/√35)(5i + j − 3k), (d) −3(5i + j − 3k)]
3. The vertices of a triangle are at the points (5, −1, 1), (4, 1, −2) and (3, 0, 2).
Find its area. [½√83]


4. Show that the perpendicular distance from the point A to the straight line joining 
points B and C is 


|a × b + b × c + c × a|
-----------------------
        |b − c|


where a, b, c are the position vectors of the points A, B, C respectively. Hence
calculate the perpendicular distance of the point (−5, 2, 3) from the line joining
the points (−1, 3, −4) and (2, 3, 4). [√(2882/73)]


5. A rigid body is rotating about an axis joining the origin to the point (6, −3, 2)
with angular velocity 14 rad/s. Find the velocity vector of the point (4, 1, 3) of
the body if distances are measured in metres. [±2(11i + 10j − 18k) m/s]


6. Show that three points A, B, C, having position vectors a, b, ¢ respectively, will 
be collinear if 


a × b + b × c + c × a = 0


7. If a particle of mass m and charge e moves with velocity q in an electric field E
and a magnetic field H, it experiences a force


F = eE + eq × H


If q = ui + vj + wk and E = Ej, H = Hk, show that the equations of motion
of the particle are


mu̇ = evH    mv̇ = eE − euH    mẇ = 0


where u̇, v̇ and ẇ are derivatives with respect to time.


Products of Three Vectors 


Since B x C is a vector, the products A.(B x C) and A x (B x C) each have 
a meaning; the first is a scalar and the second a vector. 


2.8 Scalar Triple Product 


The brackets in the scalar triple product A.(B x C) are often omitted since 
(A.B) x C has no meaning and therefore A.B x C is unambiguous. 




Figure 2.24 


Figure 2.24 illustrates a parallelepiped whose sides represent the vectors 
A, B, C (shown as a right-handed system of vectors). 

Let n be a unit vector normal to the parallelogram α formed by the vectors
B and C and having the direction of B × C. Let h be the height of the
parallelepiped in the direction of n. Volume of the parallelepiped is


V = Height h × Area of parallelogram α
  = (A.n)(|B × C|) = A.|B × C|n = A.B × C


If A, B, C do not form a right-handed system, A.n will be negative and hence
A.B × C = −V. Therefore


A.B × C = ±V
Similarly, B.C × A and C.A × B have the value ±V, so that
A.B × C = B.C × A = C.A × B = ±V (2.22)


the + or − sign being taken according as A, B, C do or do not form a
right-handed system.
From eqn (2.22), it follows that
A.B × C = C.A × B = A × B.C (2.23)


so that in a scalar triple product, the dot and cross may be interchanged 


without changing its value. It now follows that
A.A × C = A × A.C = 0


The notation [A B C] or [A, B, C] is often used to denote A.B × C or A × B.C.
If three vectors A, B, C are coplanar, the volume of the parallelepiped 
formed by them is zero. Thus A.B x C = 0. The converse is also true. Thus 
a necessary and sufficient condition that three vectors A, B, C be coplanar 
is that [A B C] = 0. 
By eqn (2.18) we have 


B × C = (B₂C₃ − B₃C₂)i + (B₃C₁ − B₁C₃)j + (B₁C₂ − B₂C₁)k
Therefore
A.B × C = A₁(B₂C₃ − B₃C₂) + A₂(B₃C₁ − B₁C₃) + A₃(B₁C₂ − B₂C₁)

          | A₁  A₂  A₃ |
        = | B₁  B₂  B₃ | (2.24)
          | C₁  C₂  C₃ |


Since the sign of a determinant is changed by interchanging two of its rows
it follows that


            | B₁  B₂  B₃ |     | B₁  B₂  B₃ |
A.B × C = − | A₁  A₂  A₃ | = + | C₁  C₂  C₃ | = B.C × A
            | C₁  C₂  C₃ |     | A₁  A₂  A₃ |


as in eqn (2.22).
The distributive law holds for scalar triple products since it holds for both
scalar and vector products, thus for example
[r, a − b, c − d] = [r a c] + [r b d] − [r b c] − [r a d]
It is, of course, essential to preserve the order of the factors. 
2.9 Equation of the Plane through Three Non-collinear 
Points 


Let A, B, C be the three non-collinear points which define the plane and let 
R(x, y, z) be any point on it (Fig. 2.25). The vectors (r − a), (b − a),
(c − a) are coplanar.


Figure 2.25


Therefore


(r − a).(b − a) × (c − a) = 0 (2.25)
i.e. (r − a).[b × c + c × a + a × b + a × a] = 0.
r.[a × b + b × c + c × a] = a.b × c (2.26)


If A, B, C are the points (xᵣ, yᵣ, zᵣ), (r = 1, 2, 3) respectively, the equation
(2.25) may be written in the determinantal form


| (x − x₁)   (y − y₁)   (z − z₁)  |
| (x₂ − x₁)  (y₂ − y₁)  (z₂ − z₁) | = 0 (2.27)
| (x₃ − x₁)  (y₃ − y₁)  (z₃ − z₁) |

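
As a rough illustration of eqn (2.25), the Cartesian equation of the plane through three points can be generated from the normal (b − a) × (c − a). The function name `plane_through` and the three sample points are assumptions made for this sketch.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_through(a, b, c):
    """Return (normal, rhs) such that normal.r = rhs is the plane through a, b, c."""
    ab = tuple(y - x for x, y in zip(a, b))
    ac = tuple(y - x for x, y in zip(a, c))
    normal = cross(ab, ac)          # perpendicular to the plane
    return normal, dot(normal, a)


# Sample points on the coordinate axes (illustrative values only).
points = [(1, 0, 0), (0, 2, 0), (0, 0, 3)]
normal, rhs = plane_through(*points)   # expected plane: 6x + 3y + 2z = 6
```

If the three points are collinear the normal comes out as the zero vector, which is the numerical counterpart of the requirement that the points be non-collinear.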
2.10 Equation of the Plane through a Given Line and Parallel 
to Another Line 


Let the plane pass through the line r = a + tb and be parallel to the vector
c. This plane contains the point A and is parallel to the two vectors b and c.
It follows that the vector b × c is perpendicular to the plane. Therefore by
eqn (2.2),


(r − a).b × c = 0 i.e. [r b c] = [a b c] (2.28)


2.11 The Common Perpendicular to Two Skew Lines 


Let the equations of the two lines be 


r = a + tb and r = c + sd


Figure 2.26


Let A and C be the points on these lines whose position vectors are a and c.
Let PQ, of length p, be their common perpendicular.


Since PQ is perpendicular to both b and d, then PQ is parallel to N = b × d.


The magnitude of PQ is equal to the magnitude of the projection of AC on N.
Therefore


p = (1/N)(c − a).N = (1/N)(c − a).b × d (2.29)


where N = |b × d|. It follows that the two lines intersect if (c − a).b × d = 0
which is otherwise obvious since this is the condition that (c − a), b and d
should be coplanar vectors.


The equation of the plane through the line AP and the common
perpendicular PQ is


(r − a).b × N = 0 i.e. (r − a).b × (b × d) = 0
The plane through CQ and PQ is
(r − c).d × (b × d) = 0


These two planes determine the common perpendicular which is their line of 
intersection. 


EXAMPLE 2.8 


Find the volume of the parallelepiped whose three concurrent sides are the
vectors


A = 3i − j + k    B = i + 2j − 3k    C = −2i + 5j

                | −2   5   0 |
±V = C.A × B =  |  3  −1   1 | = −2(1) − 5(−10) = 48
                |  1   2  −3 |
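
The scalar triple product and its cyclic symmetry can be checked directly. The sketch below takes the three vectors of this example to be A = 3i − j + k, B = i + 2j − 3k, C = −2i + 5j (as read from the example); `triple` is a hypothetical helper name.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triple(a, b, c):
    """Scalar triple product a.(b x c), i.e. [a b c]."""
    return dot(a, cross(b, c))


A, B, C = (3, -1, 1), (1, 2, -3), (-2, 5, 0)

volume = abs(triple(C, A, B))                        # volume of the parallelepiped
cyclic = triple(A, B, C) == triple(B, C, A) == triple(C, A, B)
swapped = triple(A, C, B)                            # interchanging two factors
repeated = triple(A, A, C)                           # a repeated factor
```

Taking the absolute value covers both the right-handed and the left-handed case of eqn (2.22).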




EXAMPLE 2.9 

Two straight lines pass through the points A(5, 1, 2) and B(3, 0, 1) having 
the directions of the vectors c= 2i—j+ 3k and d= —i-+ 2j — 2k 
respectively. Find the shortest distance between these lines, and the co- 
ordinates of the feet of the common perpendicular. 


Solution c × d is parallel to the common perpendicular to the two lines, and

        |  i   j   k |
c × d = |  2  −1   3 | = −4i + j + 3k
        | −1   2  −2 |

A unit vector n in the direction of the common perpendicular is

n = (−4i + j + 3k)/√26


Shortest distance between the lines is


p = |(a − b).n| = |(2i + j + k).(−4i + j + 3k)|/√26 = 4/√26


Let the common perpendicular PQ meet the lines through A and B in P
and Q respectively.
The equation of the plane through AP and PQ is [r − a, c, n] = 0, i.e.


| (x − 5)  (y − 1)  (z − 2) |
|    2       −1        3    | = 0
|   −4        1        3    |

6(x − 5) + 18(y − 1) + 2(z − 2) = 0 i.e. 3x + 9y + z = 26


The equation of the straight line BQ is

r = b + kd, i.e. x = 3 − k, y = 2k, z = 1 − 2k


This straight line meets the plane in Q for which
9 − 3k + 18k + 1 − 2k = 26 i.e. k = 16/13


Therefore Q is the point (1/13)(23, 32, −19).
Similarly, P may be shown to be the point (1/13)(31, 30, −25).
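
The feet of the common perpendicular can also be found by a different route from the one used in the solution: require PQ to be perpendicular to both direction vectors and solve the resulting pair of linear equations for the two parameters. The sketch below is one such alternative, with a hypothetical function name and exact fractions; it is not the method of the text.

```python
from fractions import Fraction


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def feet_of_common_perpendicular(a, c, b, d):
    """Feet P, Q of the common perpendicular to r = a + t c and r = b + s d.

    Illustration only: assumes integer components and non-parallel lines.
    """
    w = tuple(x - y for x, y in zip(a, b))
    cc, dd, cd = dot(c, c), dot(d, d), dot(c, d)
    wc, wd = dot(w, c), dot(w, d)
    det = cc * dd - cd * cd              # zero only when the lines are parallel
    # Solve (Q - P).c = 0 and (Q - P).d = 0 for the parameters t and s.
    t = Fraction(cd * wd - dd * wc, det)
    s = Fraction(cc * wd - cd * wc, det)
    P = tuple(ai + t * ci for ai, ci in zip(a, c))
    Q = tuple(bi + s * di for bi, di in zip(b, d))
    return P, Q


P, Q = feet_of_common_perpendicular((5, 1, 2), (2, -1, 3), (3, 0, 1), (-1, 2, -2))
dist_sq = sum((q - p) ** 2 for p, q in zip(P, Q))   # square of the shortest distance
```

The squared distance is kept as an exact fraction; its square root is the shortest distance between the lines.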




EXAMPLE 2.10 
Find the volume of a tetrahedron. 


Solution Let p be the magnitude of the common perpendicular to the two
opposite edges OA, BC of a tetrahedron (Fig. 2.27). Let n be a unit vector
parallel to the common perpendicular. Then n will be perpendicular to both
a and (b − c). Thus


Figure 2.27


n = a × (b − c)/(OA.BC.sin α)


where α is the angle of inclination of OA to BC.
Now p is the projection of CA on n. Therefore


p = (c − a).n = (c − a).a × (b − c)/(OA.BC.sin α)

  = [a b c]/(OA.BC.sin α) on expanding the numerator.

The volume V of the tetrahedron is given by


V = ⅓a.½(b × c) = ⅙[a, b, c] = ⅙OA.BC.p sin α


EXERCIS § 2.4 
1. Find the equation of the plane through the points (2,0, 1), (3, 1,5) and 
(-1, 2, —1). 2x + 2y — z = 3] 


54 scalar and vector products 


2. Show that the equation of the plane through the point A parallel to each of the 

vectors b and c is [rb c] = [abc]. 

Hence, or otherwise, find the cartesian form of the equation of the plane 
through the point (—2, 1, 3) parallel to the vectors —i + 2k and 3i + 2j — k. 
[4x — Sy + 27+ 7=0] 

- Show that the equation of the plane through the points A and B parallel to a 

vector c is [r, (b — a), c] = [abc]. 

Hence, or otherwise, find the cartesian form of the equation of the plane 
through the points (5, 1, 2) and (2, 1, 0) parallel to the vector 3i + j — 4k. 
[2x — 18y — 3z + 14 = 0] 

4. Show that the line through A (—5, —8, —3) and B (2, 13, 11) intersects the line 
through C (—7, 0, —5) and D (5, —6, 13). What is the point of intersection? 
What is the point of intersection of AC and BD? [(—3, —2, 1);(23, —120, 25)] 

5. Show that the shortest distance between the straight line through the points 

(4, 2, 5) and (—3, —1, 2) and the straight line through the points (6, —3, —1) 
and (—4, 3, —5) is 241/,/1 522. 

. Show that the equation of the plane which is perpendicular to the plane r.a = 

constant and passes through the line r = b + ¢c is [rea] = [bca]. 

7. Show that the equation of the straight line drawn through the point whose
position vector is p to intersect both of the lines r = a + tb and r = c + sd is
r = p + k(n₁ x n₂) where n₁ = b x (a − p) and n₂ = d x (c − p).

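8. Show that

$$[\mathbf{A}\,\mathbf{B}\,\mathbf{C}][\mathbf{a}\,\mathbf{b}\,\mathbf{c}] = \begin{vmatrix} \mathbf{A}\cdot\mathbf{a} & \mathbf{A}\cdot\mathbf{b} & \mathbf{A}\cdot\mathbf{c} \\ \mathbf{B}\cdot\mathbf{a} & \mathbf{B}\cdot\mathbf{b} & \mathbf{B}\cdot\mathbf{c} \\ \mathbf{C}\cdot\mathbf{a} & \mathbf{C}\cdot\mathbf{b} & \mathbf{C}\cdot\mathbf{c} \end{vmatrix}$$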


9. Two forces F₁ = i + j + k and F₂ = i + 2j − k act through points whose
position vectors are

S₁ = i + j + 2k and S₂ = qj + 5k

respectively, relative to a fixed point and in terms of three mutually perpendicular
unit vectors i, j and k. If the lines of action of F₁ and F₂ intersect, find q and find
the vector equation of the line of action of the resultant of F₁ and F₂. (U.L.)
[q = −2; r = 2(1 + t)i + (2 + 3t)j + 3k]

2.12 Vector Triple Product 


The vector triple product 
T = A x (B x C)


is a vector which is perpendicular to each of the vectors A and (B x C). 
Since (B x C) is a vector normal to the plane of B and C, it follows that T 




lies in the plane of B and C and must therefore be expressible in the form 


T = αB + βC


Choose coordinate axes, so that B and C lie in the plane of Ox, Oy and 
Ox has the direction of B (Fig. 2.28). Therefore 


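B = Bi    C = C₁i + C₂j

Figure 2.28

Let A = A₁i + A₂j + A₃k. Thus, B x C = BC₂k and

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ A_1 & A_2 & A_3 \\ 0 & 0 & BC_2 \end{vmatrix} = (A_2 C_2 B)\mathbf{i} - (A_1 C_2 B)\mathbf{j} = (A_1 C_1 + A_2 C_2)B\mathbf{i} - A_1 B(C_1\mathbf{i} + C_2\mathbf{j})$$

i.e.
A x (B x C) = (A.C)B − (A.B)C (2.30)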


It follows that 
(A x B) x C = —C x (A x B) = —(C.B)A + (C.A)B (2.31) 


so that (A x B) x C ≠ A x (B x C).
Thus, if the position of the bracket in a vector triple product is changed,
the value of the product is altered, i.e. the associative law is not, in general,
valid for vector products.
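
The expansion (2.30) and the failure of the associative law can be checked numerically. The following is a minimal sketch in Python; the three vectors and the helper names `dot`, `cross`, `scale` and `sub` are illustrative assumptions, not part of the text.

```python
# Numerical check of A x (B x C) = (A.C)B - (A.B)C for one assumed set of vectors.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(k, a):
    return tuple(k * x for x in a)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

# Illustrative integer vectors (assumed values)
A = (1, 2, 3)
B = (4, -1, 2)
C = (0, 5, -2)

lhs = cross(A, cross(B, C))                            # A x (B x C)
rhs = sub(scale(dot(A, C), B), scale(dot(A, B), C))    # (A.C)B - (A.B)C
moved_bracket = cross(cross(A, B), C)                  # (A x B) x C

print(lhs, rhs, moved_bracket)
```

For these vectors the two sides of (2.30) agree, while moving the bracket gives a different vector, as the text asserts for the general case.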


Products of Four Vectors 


The scalar product (A x B).(C x D) and the vector product (A x B) x 
(C x D) of four vectors occur, not infrequently, in vector analysis. Each is 
readily expressible in terms of scalar products. 


2.13 Scalar Product of Four Vectors 
By eqn (2.23), we have 


P.(C x D) = (P x C).D
Now let P = A x B, then 


(A x B).(C x D) = {(A x B) x C}.D 
= {(A.C)B − (B.C)A}.D    by eqn (2.31)
= (A.C)(B.D) — (B.C)(A.D) (2.32) 


2.14 Vector Product of Four Vectors 


Assume that the four vectors are localized at a point. The vector (A x B) x 
(C x D) is perpendicular to the vector (A x B) and therefore lies in the plane 
of A and B. It is therefore expressible as a linear function of A and B. 

Similarly, this vector lies in the plane of C and D and is expressible as a 
linear function of C and D. It must clearly have the direction of the line of 
intersection of these two planes. 


Now P x (C x D) = (P.D)C — (P.C)D by eqn (2.30) 
Let P = (A x B), then 
(A x B) x (C x D) = {A x B.D}C − {A x B.C}D
= [A B D]C − [A B C]D (2.33)




Similarly
(A x B) x Q = (A.Q)B − (B.Q)A
so that, with Q = (C x D),
(A x B) x (C x D) = [A C D]B − [B C D]A (2.34)


Equating the two expressions in (2.33) and (2.34) we have the following 
relation between any four vectors A, B, C, D: 


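[B C D]A − [A C D]B + [A B D]C − [A B C]D = 0 (2.35)
[cf. eqn (1.7), linear dependence of four vectors]
Thus

$$\mathbf{D} = \frac{[\mathbf{B}\,\mathbf{C}\,\mathbf{D}]\mathbf{A} - [\mathbf{A}\,\mathbf{C}\,\mathbf{D}]\mathbf{B} + [\mathbf{A}\,\mathbf{B}\,\mathbf{D}]\mathbf{C}}{[\mathbf{A}\,\mathbf{B}\,\mathbf{C}]} \qquad (2.36)$$

showing that any vector D may be expressed as a linear function of three
other vectors A, B, C provided [A B C] ≠ 0, i.e. provided A, B, C are not
coplanar.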
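
The resolution of a vector D along three non-coplanar vectors can be sketched as a short routine. The Python below is only an illustration under stated assumptions: the function name `resolve` and the sample vectors are hypothetical, and exact rational arithmetic is used so that the coefficients can be compared without rounding.

```python
# Coefficients of D = alpha*A + beta*B + gamma*C from scalar triple products,
# following the relation between four vectors derived above.
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triple(a, b, c):
    """Scalar triple product [a b c] = a.(b x c)."""
    return dot(a, cross(b, c))

def resolve(D, A, B, C):
    """Return (alpha, beta, gamma); A, B, C must not be coplanar."""
    V = triple(A, B, C)
    if V == 0:
        raise ValueError("A, B, C are coplanar")
    # [B C D] = [D B C] and -[A C D] = [A D C], by the cyclic rule
    return (Fraction(triple(D, B, C), V),
            Fraction(triple(A, D, C), V),
            Fraction(triple(A, B, D), V))

# Illustrative integer vectors (assumed values)
A, B, C, D = (1, 1, 0), (0, 1, 1), (1, 0, 1), (2, 3, 4)
alpha, beta, gamma = resolve(D, A, B, C)
rebuilt = tuple(alpha * a + beta * b + gamma * c for a, b, c in zip(A, B, C))
print(alpha, beta, gamma, rebuilt)
```

Rebuilding D from the three coefficients is a convenient check that the signs of the triple products have been taken in the right cyclic order.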


EXAMPLE 2.11 


(A) By expressing the vectors in terms of their rectangular components, 
show that 


A x (B x C) = (A.C)B — (A.B)C 
Solution 


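$$\begin{aligned}
\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) &= (A_1\mathbf{i} + A_2\mathbf{j} + A_3\mathbf{k}) \times [(B_2C_3 - B_3C_2)\mathbf{i} + (B_3C_1 - B_1C_3)\mathbf{j} + (B_1C_2 - B_2C_1)\mathbf{k}] \\
&= [A_2(B_1C_2 - B_2C_1) - A_3(B_3C_1 - B_1C_3)]\mathbf{i} \\
&\quad + [A_3(B_2C_3 - B_3C_2) - A_1(B_1C_2 - B_2C_1)]\mathbf{j} \\
&\quad + [A_1(B_3C_1 - B_1C_3) - A_2(B_2C_3 - B_3C_2)]\mathbf{k} \\
&= (A_1C_1 + A_2C_2 + A_3C_3)(B_1\mathbf{i} + B_2\mathbf{j} + B_3\mathbf{k}) - (A_1B_1 + A_2B_2 + A_3B_3)(C_1\mathbf{i} + C_2\mathbf{j} + C_3\mathbf{k}) \\
&= (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}
\end{aligned}$$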




(B) Prove that (A x B).(B x C) x (C x A) = (A.B x C)²
Solution By eqn (2.33)

(B x C) x (C x A) = [B C A]C − [B C C]A
= [A B C]C
(A x B).(B x C) x (C x A) = [A B C](A x B.C) = [A B C]²

EXAMPLE 2.12 RECIPROCAL GROUPS OF VECTORS 


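From eqn (2.36), it is clear that any vector r may be expressed in terms of
three non-coplanar vectors a, b, c ([a b c] ≠ 0) as follows:

$$\mathbf{r} = \frac{[\mathbf{r}\,\mathbf{b}\,\mathbf{c}]\mathbf{a} + [\mathbf{r}\,\mathbf{c}\,\mathbf{a}]\mathbf{b} + [\mathbf{r}\,\mathbf{a}\,\mathbf{b}]\mathbf{c}}{[\mathbf{a}\,\mathbf{b}\,\mathbf{c}]} = (\mathbf{r}\cdot\mathbf{a}')\mathbf{a} + (\mathbf{r}\cdot\mathbf{b}')\mathbf{b} + (\mathbf{r}\cdot\mathbf{c}')\mathbf{c}$$

where a′ = b x c/[a b c], b′ = c x a/[a b c], c′ = a x b/[a b c].

(a, b, c) and (a′, b′, c′) are said to be reciprocal groups of vectors.
The following relations exist between the two groups:

(i) $$\mathbf{a}\cdot\mathbf{a}' = \frac{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}{[\mathbf{a}\,\mathbf{b}\,\mathbf{c}]} = 1 = \mathbf{b}\cdot\mathbf{b}' = \mathbf{c}\cdot\mathbf{c}'$$

(ii) the scalar product of any other pair, drawn one from each group, is
zero, e.g.

$$\mathbf{b}\cdot\mathbf{a}' = \frac{\mathbf{b}\cdot\mathbf{b}\times\mathbf{c}}{[\mathbf{a}\,\mathbf{b}\,\mathbf{c}]} = 0$$

(iii) if [a b c] = V ≠ 0, then

$$[\mathbf{a}'\,\mathbf{b}'\,\mathbf{c}'] = \frac{(\mathbf{b}\times\mathbf{c})\cdot(\mathbf{c}\times\mathbf{a})\times(\mathbf{a}\times\mathbf{b})}{V^3} = \frac{V^2}{V^3} \ \text{(by Example 2.11B)} = \frac{1}{V} \neq 0$$

Thus if a, b, c are non-coplanar, a′, b′, c′ are also non-coplanar.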
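
A reciprocal group can be computed directly from its definition. The sketch below is a hedged illustration in Python: the function name `reciprocal_set` and the sample vectors are assumptions chosen for the purpose, and exact fractions are used so that relations (i) to (iii) can be tested exactly.

```python
# Reciprocal group a' = b x c/[abc], b' = c x a/[abc], c' = a x b/[abc].
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def reciprocal_set(a, b, c):
    V = dot(a, cross(b, c))          # [a b c]
    if V == 0:
        raise ValueError("a, b, c are coplanar")
    def over_V(v):
        return tuple(Fraction(x, V) for x in v)
    return over_V(cross(b, c)), over_V(cross(c, a)), over_V(cross(a, b))

# Illustrative vectors (assumed values)
a, b, c = (3, -1, 1), (1, 1, -2), (2, -3, -1)
a_r, b_r, c_r = reciprocal_set(a, b, c)
V = dot(a, cross(b, c))
V_r = dot(a_r, cross(b_r, c_r))      # [a' b' c'], expected to equal 1/V
print(a_r, b_r, c_r, V, V_r)
```

Because every product of one vector from each group is either 1 or 0, the components of any r along a, b, c are obtained simply as r.a′, r.b′, r.c′.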


EXAMPLE 2.13 SPHERICAL TRIGONOMETRY 
Let ABC be a spherical triangle on a sphere of unit radius (Fig. 2.29). By 
definition, the sides of this triangle are arcs of great circles of the sphere. 




Figure 2.29 


Let p, q,r be the position vectors of A, B, C with respect to O, the centre 
of the sphere. The side a of the triangle is the angle BOC between the unit 
vectors q and r whilst the angle A of the triangle is the angle between the 
planes AOB and AOC. The other sides and angles are similarly interpreted. 

By eqn (2.32)

(q x p).(r x p) = (q.r)(p.p) − (q.p)(p.r)

Therefore

sin c sin b n₁.n₂ = cos a.1 − cos c.cos b

where n₁ is a unit vector normal to q and p drawn out of the paper and n₂
is a unit vector normal to r and p drawn out of the paper. Thus the angle
between n₁ and n₂ is A. Therefore

n₁.n₂ = cos A and therefore

cos a = cos b cos c + sin b sin c cos A

There are, of course, two similar formulae for cos b and cos c.
Again we have by eqn (2.33),

(r x p) x (r x q) = (r.p x q)r

Noting that r is a unit vector perpendicular to both (r x p) and (r x q), the
above equation becomes

sin b n₂ x sin a n₃ = [p q r]r

where n₃ is a unit vector normal to r and q, i.e.

sin a sin b sin C r = [p q r]r    so that    sin a sin b sin C = [p q r]

Cyclic permutations of p, q, r and the sides and angles give

sin b sin c sin A = sin c sin a sin B = [p q r]

Hence sin A/sin a = sin B/sin b = sin C/sin c.
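
The cosine and sine formulae can be tested on a particular spherical triangle. The following Python sketch assumes three arbitrarily chosen unit vectors for the vertices; the helper names are illustrative, and each angle of the triangle is computed as the angle between the normals of the two planes meeting at that vertex.

```python
# Numerical check of the spherical cosine and sine rules on a unit sphere.
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    n = norm(a)
    return tuple(x / n for x in a)

def angle(u, v):
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp against rounding

# Position vectors of the vertices A, B, C (assumed values)
p = unit((1.0, 0.2, 0.1))
q = unit((0.1, 1.0, 0.3))
r = unit((0.2, 0.1, 1.0))

# Sides: a = angle BOC, b = angle COA, c = angle AOB
a, b, c = angle(q, r), angle(r, p), angle(p, q)
# Angles between the planes meeting at each vertex
A = angle(cross(p, q), cross(p, r))
B = angle(cross(q, r), cross(q, p))
C = angle(cross(r, p), cross(r, q))

cosine_gap = math.cos(a) - (math.cos(b) * math.cos(c)
                            + math.sin(b) * math.sin(c) * math.cos(A))
sine_ratios = [math.sin(A) / math.sin(a),
               math.sin(B) / math.sin(b),
               math.sin(C) / math.sin(c)]
print(cosine_gap, sine_ratios)
```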


EXERCISES 2.5
1. Prove that A x (B x C) + B x (C x A) + C x (A x B) = 0.

2. A vector r is resolved into two components, one parallel to c and the other
perpendicular to c. Show that the latter is

$$\frac{1}{c^2}(\mathbf{c} \times \mathbf{r}) \times \mathbf{c}$$

3. Show that
(i) (A x B) x A = A x (B x A)
(ii) A x {(B x A) x (A x C)} = 0
(iii) [A x B, A x C, D] = [A B C](A.D)
(iv) (A x B).(C x D) + (B x C).(A x D) + (C x A).(B x D) = 0

4. Show that (A x C) x B = 0 is a necessary and sufficient condition that
(A x B) x C = A x (B x C). Discuss the cases where either A.B = 0 or
B.C = 0.

5. Prove that
(a x b) x (c x d) + (a x c) x (b x d) = [a c d]b + [a b d]c
Find the values of r and λ which satisfy the equations
r x a = b − λa    r.a = 0
where a and b are given vectors.
Show that [r a b] = b² − (a.b)²/a²
[r = (a x b)/a², λ = (a.b)/a²]

6. Show that the two straight lines
r = a + ku    r = b + hv
intersect if [v a u] = [v b u] and that the point of intersection is expressible in the
equivalent forms

7. With reference to Example 2.12, show that

$$\mathbf{a} = \frac{\mathbf{b}' \times \mathbf{c}'}{[\mathbf{a}'\,\mathbf{b}'\,\mathbf{c}']}$$

and hence that r = (r.a)a′ + (r.b)b′ + (r.c)c′

8. Find the set of vectors which are reciprocals of the set
3i − j + k    i + j − 2k    2i − 3j − k
[(1/23)(7i + 3j + 5k), (1/23)(4i + 5j − 7k), −(1/23)(i + 7j + 4k)]


3 Differentiation of vectors 


3.1 Scalar and Vector Fields 


If φ(x, y, z) has a scalar value at each point (x, y, z) of a region, then φ is
said to be a scalar function of position and defines a scalar field φ in that
region. For example, if the electrical potential in a region is given by the
scalar function φ = xy²z − x²y, then φ defines a scalar field.

On the other hand, if a vector A(x, y, z) = A₁(x, y, z)i + A₂(x, y, z)j +
A₃(x, y, z)k corresponds to each point of a region, A(x, y, z) is said to be a
vector function of position and defines a vector field in that region. For
example, if the velocity v in a fluid is given by v = xyi + x²j − y²zk, then v
defines a vector field in the fluid.


3.2 Derivative of a Vector 


Let A(u) be a vector function of a single scalar variable u. Let δu be an
increment in u. The corresponding increment δA in A is given by (see Fig. 3.1)

$$\delta\mathbf{A} = \mathbf{A}(u + \delta u) - \mathbf{A}(u)$$

Figure 3.1

The derivative or differential coefficient of A with respect to u is defined to be

$$\lim_{\delta u \to 0} \frac{\delta\mathbf{A}}{\delta u}$$

(if this exists). It is denoted by dA/du and is itself a vector. Thus

$$\frac{d\mathbf{A}}{du} = \lim_{\delta u \to 0} \frac{\delta\mathbf{A}}{\delta u} = \lim_{\delta u \to 0} \frac{\mathbf{A}(u + \delta u) - \mathbf{A}(u)}{\delta u}$$

if this limit exists.
Since dA/du is a vector function of u,

$$\frac{d}{du}\left(\frac{d\mathbf{A}}{du}\right) = \frac{d^2\mathbf{A}}{du^2}$$

may also be found if it exists. Similarly derivatives of higher orders may be
found.

(Questions of continuity and differentiability will not be discussed here. 
It will be assumed that all the functions considered are continuous, single- 
valued and differentiable to any order.) 

It is obvious that the derivative of any vector whose magnitude and direction
are constant is zero. For example

$$\frac{d\mathbf{i}}{du} = \frac{d\mathbf{j}}{du} = \frac{d\mathbf{k}}{du} = 0$$

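If A(u) and B(u) are vector functions of u, it may easily be shown that

$$\frac{d}{du}(\mathbf{A} + \mathbf{B}) = \frac{d\mathbf{A}}{du} + \frac{d\mathbf{B}}{du}$$

for if δA and δB are the increments of A and B corresponding to the increment
δu,

$$\delta(\mathbf{A} + \mathbf{B}) = (\mathbf{A} + \delta\mathbf{A} + \mathbf{B} + \delta\mathbf{B}) - (\mathbf{A} + \mathbf{B}) = \delta\mathbf{A} + \delta\mathbf{B}$$

$$\frac{\delta(\mathbf{A} + \mathbf{B})}{\delta u} = \frac{\delta\mathbf{A}}{\delta u} + \frac{\delta\mathbf{B}}{\delta u}$$

On taking the limits as δu → 0, the result follows.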




Let A be a vector function of the scalar variable u and let u be a scalar
function of the scalar variable t. Let δA and δu be the increments of A and u
which correspond to the increment δt, then

$$\frac{\delta\mathbf{A}}{\delta t} = \frac{\delta\mathbf{A}}{\delta u}\,\frac{\delta u}{\delta t}$$

is an algebraic identity.
On taking the limits as δt → 0, we have

$$\frac{d\mathbf{A}}{dt} = \frac{d\mathbf{A}}{du}\,\frac{du}{dt} \qquad (3.1)$$


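Let φ and A be respectively scalar and vector functions of u and let δφ and
δA be their increments which correspond to the increment δu. Therefore

$$\delta(\phi\mathbf{A}) = (\phi + \delta\phi)(\mathbf{A} + \delta\mathbf{A}) - \phi\mathbf{A} = \delta\phi\,\mathbf{A} + \phi\,\delta\mathbf{A} + \delta\phi\,\delta\mathbf{A}$$

i.e.

$$\frac{\delta(\phi\mathbf{A})}{\delta u} = \frac{\delta\phi}{\delta u}\mathbf{A} + \phi\frac{\delta\mathbf{A}}{\delta u} + \frac{\delta\phi}{\delta u}\delta\mathbf{A}$$

In the limit as δu → 0,

$$\frac{d(\phi\mathbf{A})}{du} = \frac{d\phi}{du}\mathbf{A} + \phi\frac{d\mathbf{A}}{du} \qquad (3.2)$$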


Writing A in terms of its rectangular components, we have

A(u) = A₁(u)i + A₂(u)j + A₃(u)k

where i, j, k are constant unit vectors, so that

$$\frac{d\mathbf{A}}{du} = \frac{dA_1}{du}\mathbf{i} + \frac{dA_2}{du}\mathbf{j} + \frac{dA_3}{du}\mathbf{k} \qquad (3.3)$$




3.3 Curves in Space 


Each value of u leads to a single value of A. Let A be the position vector of P 
relative to an origin O. As u varies continuously so does A(u), and P traces 
out a continuous curve in space (Fig. 3.2). 




Figure 3.2 


Let P, P′ be the positions of the moving point corresponding to u and
u + δu respectively. Then

OP = A(u) and OP′ = A(u + δu)

Therefore

δA = PP′ = A(u + δu) − A(u)

so that δA/δu is a vector parallel to PP′.

In the limit as δu → 0 the chord PP′ of the space curve tends to coincidence
with the tangent to the curve at P. Thus

$$\frac{d\mathbf{A}}{du} = \lim_{\delta u \to 0} \frac{\delta\mathbf{A}}{\delta u}$$

is a vector having the direction of the tangent at P in the sense of increasing u.
We may write

A(u) = A₁(u)i + A₂(u)j + A₃(u)k

so that the space curve has the parametric equations

x = A₁(u)    y = A₂(u)    z = A₃(u)

and

$$\frac{d\mathbf{A}}{du} = \frac{dA_1}{du}\mathbf{i} + \frac{dA_2}{du}\mathbf{j} + \frac{dA_3}{du}\mathbf{k}$$


Using the customary notation, let r(u) be the position vector of the point
P(x, y, z) so that

r(u) = x(u)i + y(u)j + z(u)k

If now the parameter u represents the time t, dr/dt will represent the vector
velocity v of the point P in its path and is given by

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k}$$

and v has a direction which is always tangential to the path of P.

Again if A represents the velocity vector v, dv/dt will be the acceleration
vector a.

In kinematics, the abbreviations $\dot{\mathbf{r}}, \ddot{\mathbf{r}}, \ldots$ are widely used to denote first,
second, ... derivatives of r with respect to the time t. Thus $\mathbf{v} = \dot{\mathbf{r}}$, $\mathbf{a} = \dot{\mathbf{v}} = \ddot{\mathbf{r}}$, and so on.


EXAMPLE 3.1 
(A) A particle moves along a curve whose parametric equations are

x = a sin pt    y = a cos pt    z = bt

where t is the time (t > 0) and a, b, p are constants. Determine the velocity
and acceleration vectors at time t and show that their magnitudes are constant.

Solution The position vector of the particle is

r = (a sin pt)i + (a cos pt)j + (bt)k

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = (ap\cos pt)\mathbf{i} + (-ap\sin pt)\mathbf{j} + b\mathbf{k}$$

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = (-ap^2\sin pt)\mathbf{i} + (-ap^2\cos pt)\mathbf{j} + 0\mathbf{k}$$

Therefore

$$|\mathbf{v}| = \sqrt{a^2p^2\cos^2 pt + a^2p^2\sin^2 pt + b^2} = \sqrt{a^2p^2 + b^2} \qquad |\mathbf{a}| = ap^2$$

(B) With reference to (A), find the components of the velocity and acceleration
at time t = 0 in the direction of the vector 3i + 2j − 6k.

Solution At time t = 0, v = api + bk and a = −ap²j.
A unit vector in the direction of the vector 3i + 2j − 6k is

$$\mathbf{d} = \frac{3\mathbf{i} + 2\mathbf{j} - 6\mathbf{k}}{7}$$

The components of v and a in this direction are respectively

$$\mathbf{v}\cdot\mathbf{d} = \tfrac{3}{7}(ap - 2b) \quad\text{and}\quad \mathbf{a}\cdot\mathbf{d} = -\tfrac{2}{7}ap^2$$
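
The results of this example can be checked by differentiating the position vector numerically. The Python sketch below uses central differences, which only approximate the limit in the definition of the derivative; the values chosen for a, b and p and the function names are assumptions made for illustration.

```python
# Finite-difference check of the speed, the acceleration and their components
# along 3i + 2j - 6k for the curve x = a sin pt, y = a cos pt, z = bt.
import math

a, b, p = 2.0, 0.5, 3.0          # illustrative constants (assumed values)

def r(t):
    return (a * math.sin(p * t), a * math.cos(p * t), b * t)

def first_derivative(f, t, h=1e-5):
    fp, fm = f(t + h), f(t - h)
    return tuple((x - y) / (2 * h) for x, y in zip(fp, fm))

def second_derivative(f, t, h=1e-4):
    fp, f0, fm = f(t + h), f(t), f(t - h)
    return tuple((x - 2 * y + z) / (h * h) for x, y, z in zip(fp, f0, fm))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

speed = norm(first_derivative(r, 0.7))     # should not depend on t
accel = norm(second_derivative(r, 0.7))    # should not depend on t

d = (3 / 7, 2 / 7, -6 / 7)                 # unit vector along 3i + 2j - 6k
v_along_d = dot(first_derivative(r, 0.0), d)
a_along_d = dot(second_derivative(r, 0.0), d)
print(speed, accel, v_along_d, a_along_d)
```

The computed values may be compared with √(a²p² + b²), ap², 3(ap − 2b)/7 and −2ap²/7 respectively.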


EXERCISES 3.1 
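1. Given A(u) = (a cos u)i + (a sin u)j + (au cot α)k, show that

$$\left|\frac{d\mathbf{A}}{du}\right| = a\,\mathrm{cosec}\,\alpha \quad\text{and}\quad \left|\frac{d^2\mathbf{A}}{du^2}\right| = a$$

2. A particle moves along a curve whose parametric equations are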


x= 3t yea g=t-—t 


where ¢ is the time. Show that the components of its velocity and acceleration 
in the direction of the vector 2i — j — k at t = 2 are respectively 9/,/6 and 8/,/6. 


3. A particle of mass m is moving in a plane under the influence of a force F so
that its cartesian coordinates in the plane at time t are [3a sin (t/t₀), 3a cos (t/t₀)]
where a is a constant length and t₀ is a constant time. Describe the motion of the
particle as fully as you can and find the time which elapses before the particle
returns next to its starting point at t = 0.

Express the components of the particle's acceleration in the directions of the
two axes and hence prove that F is of constant magnitude.

Describe carefully how the direction of F varies with time. At time t = 4πt₀,
the force F is suddenly annihilated. Find the position of the particle at t = 8πt₀.

Describe a physical situation of which this question could be a mathematical
model. (Oxford and Cambridge A-level, S.M.P.)

[Uniform motion in a circle radius 3a; angular velocity 1/t₀; initial position
r = 3aj; returns after time 2πt₀; F of constant magnitude 3ma/t₀²; when t =
8πt₀, r = 12πai + 3aj]


3.4 Derivatives of Scalar and Vector Products 


The method employed to find the derivatives of the products of algebraic 
functions is directly applicable to the various products of vectors. It will be 
sufficient to use this method to prove that 


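$$\frac{d}{du}(\mathbf{A} \times \mathbf{B}) = \frac{d\mathbf{A}}{du} \times \mathbf{B} + \mathbf{A} \times \frac{d\mathbf{B}}{du} \qquad (3.4)$$

Let δA and δB be the increments in A and B corresponding to the increment
δu then

$$\frac{\delta(\mathbf{A} \times \mathbf{B})}{\delta u} = \frac{(\mathbf{A} + \delta\mathbf{A}) \times (\mathbf{B} + \delta\mathbf{B}) - \mathbf{A} \times \mathbf{B}}{\delta u} = \frac{\delta\mathbf{A} \times \mathbf{B} + \mathbf{A} \times \delta\mathbf{B} + \delta\mathbf{A} \times \delta\mathbf{B}}{\delta u}$$

Now taking the limit as δu → 0, the result follows. In eqn (3.4) the order
of the terms in the products is, of course, significant.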
In a similar manner, it can readily be shown that 


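$$\frac{d}{du}(\mathbf{A}\cdot\mathbf{B}) = \frac{d\mathbf{A}}{du}\cdot\mathbf{B} + \mathbf{A}\cdot\frac{d\mathbf{B}}{du} \qquad (3.5)$$

from which it follows that

$$\frac{d}{du}(\mathbf{A}\cdot\mathbf{A}) = 2\mathbf{A}\cdot\frac{d\mathbf{A}}{du}$$

Also

$$\frac{d}{du}(\mathbf{A}\cdot\mathbf{A}) = \frac{d}{du}(A^2) = 2A\frac{dA}{du}$$

Therefore

$$\mathbf{A}\cdot\frac{d\mathbf{A}}{du} = A\frac{dA}{du} \qquad (3.6)$$

If now A is a vector of constant magnitude and variable direction, i.e. A
is constant,

$$\frac{dA}{du} = 0 \quad\text{and}\quad \frac{d\mathbf{A}}{du} \ \text{is perpendicular to}\ \mathbf{A}$$

Scalar and vector triple products may be readily differentiated by applying
eqns (3.4) and (3.5). The following formulae are obtained

$$\frac{d}{du}[\mathbf{A}\,\mathbf{B}\,\mathbf{C}] = \left[\frac{d\mathbf{A}}{du}\,\mathbf{B}\,\mathbf{C}\right] + \left[\mathbf{A}\,\frac{d\mathbf{B}}{du}\,\mathbf{C}\right] + \left[\mathbf{A}\,\mathbf{B}\,\frac{d\mathbf{C}}{du}\right] \qquad (3.7)$$

$$\frac{d}{du}\{\mathbf{A} \times (\mathbf{B} \times \mathbf{C})\} = \frac{d\mathbf{A}}{du} \times (\mathbf{B} \times \mathbf{C}) + \mathbf{A} \times \left(\frac{d\mathbf{B}}{du} \times \mathbf{C}\right) + \mathbf{A} \times \left(\mathbf{B} \times \frac{d\mathbf{C}}{du}\right) \qquad (3.8)$$

In eqn (3.7) the cyclic order must be preserved and in eqn (3.8) the order of
the terms in the products is significant.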
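
The rule for differentiating a vector product can be compared with a direct numerical derivative. In the Python sketch below the two vector functions A(u) and B(u), their hand-computed derivatives and all function names are illustrative assumptions; the central difference is an approximation, so agreement is expected only to within a small tolerance.

```python
# Compare d/du (A x B) obtained from the product rule with a central difference.
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

# Illustrative vector functions (assumed) and their derivatives
def A(u):
    return (6 * u * u, -u ** 3, u)

def B(u):
    return (2 * math.cos(u), -2 * math.sin(u), 1.0)

def dA(u):
    return (12 * u, -3 * u * u, 1.0)

def dB(u):
    return (-2 * math.sin(u), -2 * math.cos(u), 0.0)

def product_rule(u):
    # dA/du x B + A x dB/du, keeping the order of the factors
    return add(cross(dA(u), B(u)), cross(A(u), dB(u)))

def central_difference(u, h=1e-5):
    fp, fm = cross(A(u + h), B(u + h)), cross(A(u - h), B(u - h))
    return tuple((x - y) / (2 * h) for x, y in zip(fp, fm))

u0 = 1.0
exact = product_rule(u0)
approx = central_difference(u0)
print(exact, approx)
```

Reversing the order of the factors in either term of `product_rule` changes the sign of that term, which is why the order in eqn (3.4) matters.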


EXAMPLE 3.2 
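(A) If a is a constant vector and r is a vector function of the scalar variable
t, find the derivative of r³r + a x dr/dt with respect to t.

Solution

$$\frac{d}{dt}(r^3\mathbf{r}) = r^3\frac{d\mathbf{r}}{dt} + 3r^2\frac{dr}{dt}\mathbf{r} \qquad \frac{d}{dt}\left(\mathbf{a} \times \frac{d\mathbf{r}}{dt}\right) = \mathbf{a} \times \frac{d^2\mathbf{r}}{dt^2}$$

i.e.

$$\frac{d}{dt}(r^3\mathbf{r} + \mathbf{a} \times \dot{\mathbf{r}}) = r^3\dot{\mathbf{r}} + 3r^2\dot{r}\,\mathbf{r} + \mathbf{a} \times \ddot{\mathbf{r}}$$

(B) Show that $\frac{d}{dt}[\mathbf{r}\,\dot{\mathbf{r}}\,\ddot{\mathbf{r}}] = [\mathbf{r}\,\dot{\mathbf{r}}\,\dddot{\mathbf{r}}]$

$$\frac{d}{dt}[\mathbf{r}\,\dot{\mathbf{r}}\,\ddot{\mathbf{r}}] = [\dot{\mathbf{r}}\,\dot{\mathbf{r}}\,\ddot{\mathbf{r}}] + [\mathbf{r}\,\ddot{\mathbf{r}}\,\ddot{\mathbf{r}}] + [\mathbf{r}\,\dot{\mathbf{r}}\,\dddot{\mathbf{r}}] = [\mathbf{r}\,\dot{\mathbf{r}}\,\dddot{\mathbf{r}}]$$

since the first and second scalar triple products on the right-hand side are
each zero.

(C) If r = a cos nt + b sin nt, where a, b, n are constants, show that

(i) $\ddot{\mathbf{r}} + n^2\mathbf{r} = 0$    (ii) $\mathbf{r} \times \dot{\mathbf{r}} = n\,\mathbf{a} \times \mathbf{b}$

(i) $\dot{\mathbf{r}} = \mathbf{a}(-n\sin nt) + \mathbf{b}(n\cos nt)$
$\ddot{\mathbf{r}} = \mathbf{a}(-n^2\cos nt) + \mathbf{b}(-n^2\sin nt) = -n^2\mathbf{r}$
$\ddot{\mathbf{r}} + n^2\mathbf{r} = 0$
(ii) $\mathbf{r} \times \dot{\mathbf{r}} = (\mathbf{a}\cos nt + \mathbf{b}\sin nt) \times n(-\mathbf{a}\sin nt + \mathbf{b}\cos nt)$
$= n(\cos^2 nt + \sin^2 nt)(\mathbf{a} \times \mathbf{b})$
$= n\,\mathbf{a} \times \mathbf{b}$
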
EXAMPLE 3.3 


Find the radial and transverse components of the velocity and acceleration 
of a particle moving on a curve. 


Figure 3.3 


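Solution Let the position of a point R on the curve (Fig. 3.3) be specified
by its polar coordinates (r, θ), θ being measured from the direction of a fixed
vector c. Let r₁, s₁ be unit vectors along and perpendicular to OR respectively
moving with OR.

Figure 3.4

As the time t varies, the point P, whose position vector is r₁, moves in a
circle of unit radius (Fig. 3.4). If P, P′ are successive positions of this point
at times t, t + δt, then arc PP′ = δθ.

Now δr₁ = PP′ and, in the limit, as δθ → 0, PP′ has the direction of the
tangent to the circle at P, i.e. has the direction of s₁, and |PP′| has the
magnitude δθ. Therefore

$$\frac{d\mathbf{r}_1}{d\theta} = \lim_{\delta\theta \to 0}\frac{\delta\mathbf{r}_1}{\delta\theta} = \mathbf{s}_1 \qquad \frac{d\mathbf{r}_1}{dt} = \frac{d\mathbf{r}_1}{d\theta}\frac{d\theta}{dt} = \dot\theta\,\mathbf{s}_1$$

Similarly

$$\frac{d\mathbf{s}_1}{dt} = -\dot\theta\,\mathbf{r}_1$$

Now the velocity v of the particle at R (Fig. 3.3) is

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d}{dt}(r\mathbf{r}_1) = \dot r\,\mathbf{r}_1 + r\dot{\mathbf{r}}_1 = \dot r\,\mathbf{r}_1 + r\dot\theta\,\mathbf{s}_1$$

Thus the radial and transverse components of the velocity are $\dot r$ and $r\dot\theta$
respectively.

Now

$$\begin{aligned}
\mathbf{a} = \frac{d\mathbf{v}}{dt} &= \frac{d}{dt}\{\dot r\,\mathbf{r}_1 + r\dot\theta\,\mathbf{s}_1\} \\
&= \ddot r\,\mathbf{r}_1 + \dot r\frac{d\mathbf{r}_1}{dt} + (\dot r\dot\theta + r\ddot\theta)\mathbf{s}_1 + r\dot\theta\frac{d\mathbf{s}_1}{dt} \\
&= \ddot r\,\mathbf{r}_1 + \dot r\dot\theta\,\mathbf{s}_1 + (\dot r\dot\theta + r\ddot\theta)\mathbf{s}_1 - r\dot\theta^2\mathbf{r}_1 \\
&= (\ddot r - r\dot\theta^2)\mathbf{r}_1 + (2\dot r\dot\theta + r\ddot\theta)\mathbf{s}_1
\end{aligned}$$

The radial and transverse components of the acceleration are respectively

$$(\ddot r - r\dot\theta^2) \quad\text{and}\quad (2\dot r\dot\theta + r\ddot\theta) \ \text{or}\ \frac{1}{r}\frac{d}{dt}(r^2\dot\theta)$$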


EXAMPLE 3.4 

A point P has coordinates (x, y) with respect to axes Ox, Oy which are 
rotating about O in their own plane with uniform angular velocity w. Find 
expressions for the velocity and acceleration of P. 


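Solution The position vector of P is

r = xi + yj

where i, j are unit vectors rotating with angular velocity ω. Therefore

$$\mathbf{v} = \dot{\mathbf{r}} = \dot x\,\mathbf{i} + x\frac{d\mathbf{i}}{dt} + \dot y\,\mathbf{j} + y\frac{d\mathbf{j}}{dt} = (\dot x - \omega y)\mathbf{i} + (\dot y + \omega x)\mathbf{j}$$

since di/dt = ωj and dj/dt = −ωi by Example 3.3.

$$\begin{aligned}
\mathbf{a} = \ddot{\mathbf{r}} &= (\ddot x - \omega\dot y)\mathbf{i} + (\dot x - \omega y)\omega\mathbf{j} + (\ddot y + \omega\dot x)\mathbf{j} + (\dot y + \omega x)(-\omega\mathbf{i}) \\
&= (\ddot x - 2\omega\dot y - \omega^2 x)\mathbf{i} + (\ddot y + 2\omega\dot x - \omega^2 y)\mathbf{j}
\end{aligned}$$

Clearly $(\ddot x\,\mathbf{i} + \ddot y\,\mathbf{j})$ is the acceleration of P relative to the rotating axes. Note
that this is the total acceleration of P if the axes were stationary (ω = 0).

The terms $-\omega^2(x\mathbf{i} + y\mathbf{j}) = -\omega^2\mathbf{r}$ comprise the centripetal acceleration
whilst the terms $2\omega(-\dot y\,\mathbf{i} + \dot x\,\mathbf{j})$ are the so-called Coriolis accelerations which
are zero when ω = 0.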


EXERCISES 3.2 
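1. If A = 6u²i − u³j + uk and B = 2 cos u i − 2 sin u j + k, find

(i) $\frac{d}{du}(\mathbf{A}\cdot\mathbf{B})$    (ii) $\frac{d}{du}(\mathbf{A} \times \mathbf{B})$    (iii) $\frac{d}{du}(\mathbf{A}\cdot\mathbf{A})$

[(i) 2u(12 + u²) cos u − 6u² sin u + 1    (ii) {2(sin u + u cos u) − 3u²}i
+ {2(cos u − u sin u) − 12u}j − 2{(12u + u³) sin u + 3u² cos u}k
(iii) 2u(1 + 72u² + 3u⁴)]

2. Show that

(a) $\frac{d}{dt}(\tfrac{1}{2}m\dot{\mathbf{r}}^2) = m\dot{\mathbf{r}}\cdot\ddot{\mathbf{r}}$    (b) $\frac{d}{dt}(\mathbf{r} \times \dot{\mathbf{r}}) = \mathbf{r} \times \ddot{\mathbf{r}}$

(c) $\frac{d}{dt}\{\mathbf{r} \times (\dot{\mathbf{r}} \times \ddot{\mathbf{r}})\} = \dot{\mathbf{r}} \times (\dot{\mathbf{r}} \times \ddot{\mathbf{r}}) + \mathbf{r} \times (\dot{\mathbf{r}} \times \dddot{\mathbf{r}})$

(d) $\frac{d}{du}\left(\mathbf{A} \times \frac{d\mathbf{B}}{du} - \frac{d\mathbf{A}}{du} \times \mathbf{B}\right) = \mathbf{A} \times \frac{d^2\mathbf{B}}{du^2} - \frac{d^2\mathbf{A}}{du^2} \times \mathbf{B}$

3. A particle moves on the circumference of a circle of radius c with a constant
angular velocity ω. Show that the position vector of the particle may be expressed
in the form

r = c cos ωt i + c sin ωt j

Show that its acceleration a = −ω²r and that its velocity v is such that r.v = 0
and r x v = c²ωk. Interpret these results.

4. With reference to Example 3.3, show that

$$\frac{d^2\mathbf{r}_1}{dt^2} = \mathbf{s}_1\ddot\theta - \mathbf{r}_1\dot\theta^2 \quad\text{and}\quad \frac{d^2\mathbf{s}_1}{dt^2} = -\mathbf{r}_1\ddot\theta - \mathbf{s}_1\dot\theta^2$$

5. An insect is crawling at a constant velocity v along a spoke of a bicycle wheel
towards the circumference whilst the wheel is rotating with constant angular
velocity ω. Find the magnitude and direction of the velocity and acceleration of
the insect when it is at a distance c from the centre.
[√(v² + ω²c²) at tan⁻¹(ωc/v) to the spoke; ω√(4v² + ω²c²) at tan⁻¹(−2v/ωc)
to the spoke]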


3.5 The Elements of Differential Geometry 


We have seen that the position vector function r(u), where u is a scalar,
defines a curve in space. If the scalar u is taken to be s, the length of the arc
measured from any fixed point A on the curve to a variable point R (Fig.
3.5), then the position vector of R is a vector function r(s) of the scalar s.

Figure 3.5

Let R, R′ be the points r, r + δr corresponding to s, s + δs respectively.
The chord RR′ = |δr| and the arc RR′ = δs. As δs → 0, the ratio (chord RR′/
arc RR′) tends to unity. It follows that

$$\left|\frac{d\mathbf{r}}{ds}\right| = \lim_{\delta s \to 0}\left|\frac{\delta\mathbf{r}}{\delta s}\right| = \lim_{\delta s \to 0}\frac{\text{chord } RR'}{\text{arc } RR'} = 1$$

Also since δr/δs is a vector parallel to δr, in the limit as δs → 0, its direction
will be that of the tangent to the curve at R. Therefore

$$\frac{d\mathbf{r}}{ds} = \lim_{\delta s \to 0}\frac{\delta\mathbf{r}}{\delta s} = \mathbf{T} \qquad (3.9)$$

where T is a unit vector having the direction of the tangent to the curve at R
in the sense of s increasing.

Now r(s) = x(s)i + y(s)j + z(s)k, so that

$$\frac{d\mathbf{r}}{ds} = \frac{dx}{ds}\mathbf{i} + \frac{dy}{ds}\mathbf{j} + \frac{dz}{ds}\mathbf{k} = \mathbf{T}$$

Thus (dx/ds, dy/ds, dz/ds) are the direction cosines of the unit tangent T.
Since T.T = 1, then T.dT/ds = 0, and therefore dT/ds is perpendicular to T.

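Let N be a unit vector in the direction of dT/ds, then

$$\frac{d\mathbf{T}}{ds} = \kappa\mathbf{N} \qquad (3.10)$$

N, which is perpendicular to T, is called the principal normal, κ the curvature,
and ρ = 1/κ the radius of curvature of the curve at R (Fig. 3.6).

Figure 3.6

Let B be a unit vector such that B = T x N. Then T, N, B form a right-
handed system of unit vectors.

$$\frac{d\mathbf{B}}{ds} = \frac{d\mathbf{T}}{ds} \times \mathbf{N} + \mathbf{T} \times \frac{d\mathbf{N}}{ds} = \kappa\mathbf{N} \times \mathbf{N} + \mathbf{T} \times \frac{d\mathbf{N}}{ds} = \mathbf{T} \times \frac{d\mathbf{N}}{ds}$$

Therefore dB/ds is perpendicular to T.

But B.B = 1, therefore

$$\mathbf{B}\cdot\frac{d\mathbf{B}}{ds} = 0$$

Thus dB/ds is perpendicular to both B and T and is therefore parallel to N.
Therefore

$$\frac{d\mathbf{B}}{ds} = -\tau\mathbf{N} \qquad (3.11)$$

B is called the binormal, τ the torsion and σ = 1/τ the radius of torsion of
the curve at R.
Since N = B x T

$$\frac{d\mathbf{N}}{ds} = \frac{d\mathbf{B}}{ds} \times \mathbf{T} + \mathbf{B} \times \frac{d\mathbf{T}}{ds} = -\tau\mathbf{N} \times \mathbf{T} + \mathbf{B} \times \kappa\mathbf{N} = \tau\mathbf{B} - \kappa\mathbf{T} \qquad (3.12)$$

The formulae given in eqns (3.10), (3.11) and (3.12):

$$\frac{d\mathbf{T}}{ds} = \kappa\mathbf{N} \qquad \frac{d\mathbf{N}}{ds} = \tau\mathbf{B} - \kappa\mathbf{T} \qquad \frac{d\mathbf{B}}{ds} = -\tau\mathbf{N} \qquad (3.13)$$

are known as the Frenet-Serret formulae.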

The osculating plane to the curve at the point R is the plane containing 
the tangent and the principal normal. The normal plane and the rectifying 
plane are planes through R which are normal to the tangent and the principal 
normal respectively. 


3.6 Equations of Tangent, Normal and Binormal 

Let r₁ be the position vector of the point R₁ on a curve and let T₁, N₁, B₁
be the unit tangent, normal and binormal at R₁ respectively (Fig. 3.7). Let
r be the position vector of any variable point R. If R is on the tangent at R₁
then (r − r₁) is parallel to T₁. Therefore

(r − r₁) x T₁ = 0 (3.14)

which is the equation of the tangent at R₁.

Figure 3.7

Similarly, the equations of the normal and the binormal are respectively

(r − r₁) x N₁ = 0    (r − r₁) x B₁ = 0

If R is in the osculating plane, then B₁ is normal to (r − r₁). Thus

(r − r₁).B₁ = 0 (3.15)

which is the equation of the osculating plane at R₁.
Similarly, the equations of the normal and rectifying planes are respectively

(r − r₁).T₁ = 0    (r − r₁).N₁ = 0


EXAMPLE 3.5 
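(A) Show that

(i) $\left|\frac{d^2\mathbf{r}}{ds^2}\right| = \kappa$    (ii) $\frac{d\mathbf{r}}{ds} \times \frac{d^2\mathbf{r}}{ds^2} = \kappa\mathbf{B}$    (iii) $\left[\frac{d\mathbf{r}}{ds}, \frac{d^2\mathbf{r}}{ds^2}, \frac{d^3\mathbf{r}}{ds^3}\right] = \kappa^2\tau$

Solution

(i) $\frac{d\mathbf{r}}{ds} = \mathbf{T}$. Thus

$$\frac{d^2\mathbf{r}}{ds^2} = \frac{d\mathbf{T}}{ds} = \kappa\mathbf{N} \quad\text{i.e.}\quad \left|\frac{d^2\mathbf{r}}{ds^2}\right| = \kappa$$

(ii) $\frac{d\mathbf{r}}{ds} \times \frac{d^2\mathbf{r}}{ds^2} = \mathbf{T} \times \kappa\mathbf{N} = \kappa\mathbf{B}$

(iii) $\frac{d^3\mathbf{r}}{ds^3} = \frac{d}{ds}(\kappa\mathbf{N}) = \frac{d\kappa}{ds}\mathbf{N} + \kappa\frac{d\mathbf{N}}{ds} = \frac{d\kappa}{ds}\mathbf{N} + \kappa(\tau\mathbf{B} - \kappa\mathbf{T})$

Thus

$$\left[\frac{d\mathbf{r}}{ds}, \frac{d^2\mathbf{r}}{ds^2}, \frac{d^3\mathbf{r}}{ds^3}\right] = \kappa\mathbf{B}\cdot\frac{d^3\mathbf{r}}{ds^3} = \kappa^2\tau$$

(B) Show for the plane curve y = f(x), z = 0, that ρ = (1 + y′²)^{3/2}/y″ where
a dash indicates differentiation with respect to x.

Solution r = xi + yj. Therefore

$$\mathbf{T} = \frac{d\mathbf{r}}{ds} = \frac{d\mathbf{r}/dx}{ds/dx} = \frac{\mathbf{i} + y'\mathbf{j}}{(1 + y'^2)^{1/2}}$$

since ds/dx = |dr/dx|. Therefore, if N is the unit normal,

$$\frac{d\mathbf{T}}{ds} = \frac{\mathbf{N}}{\rho} = \frac{d\mathbf{T}/dx}{ds/dx} = \frac{(1 + y'^2)^{1/2}y''\mathbf{j} - (\mathbf{i} + y'\mathbf{j})y'y''/(1 + y'^2)^{1/2}}{(1 + y'^2)\,(1 + y'^2)^{1/2}} = \frac{y''(-y'\mathbf{i} + \mathbf{j})}{(1 + y'^2)^2} = \frac{y''}{(1 + y'^2)^{3/2}}\,\mathbf{N}$$

where N is the unit vector (−y′i + j)/(1 + y′²)^{1/2}.

Therefore

$$\rho = \frac{(1 + y'^2)^{3/2}}{y''}$$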


EXAMPLE 3.6 
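A curve is defined by the parametric equations

x = tan⁻¹ s    y = (1/√2) log (1 + s²)    z = s − tan⁻¹ s

where the parameter s measures the length of the curve. Find τ and κ and
the equations of the tangent and of the osculating plane at the point s = 1.

r = (tan⁻¹ s)i + (1/√2) log (1 + s²) j + (s − tan⁻¹ s)k

Therefore

$$\mathbf{T} = \frac{d\mathbf{r}}{ds} = \frac{1}{1 + s^2}\{\mathbf{i} + \sqrt2 s\,\mathbf{j} + s^2\mathbf{k}\}$$

$$\frac{d\mathbf{T}}{ds} = \frac{(1 + s^2)\{\sqrt2\,\mathbf{j} + 2s\,\mathbf{k}\} - 2s\{\mathbf{i} + \sqrt2 s\,\mathbf{j} + s^2\mathbf{k}\}}{(1 + s^2)^2} = \frac{\sqrt2}{1 + s^2}\cdot\frac{\{-\sqrt2 s\,\mathbf{i} + (1 - s^2)\mathbf{j} + \sqrt2 s\,\mathbf{k}\}}{1 + s^2}$$

Therefore

$$\kappa = \frac{\sqrt2}{1 + s^2} \quad\text{and}\quad \mathbf{N} = \frac{1}{1 + s^2}\{-\sqrt2 s\,\mathbf{i} + (1 - s^2)\mathbf{j} + \sqrt2 s\,\mathbf{k}\}$$

Now

$$\mathbf{B} = \mathbf{T} \times \mathbf{N} = \frac{1}{(1 + s^2)^2}\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & \sqrt2 s & s^2 \\ -\sqrt2 s & (1 - s^2) & \sqrt2 s \end{vmatrix} = \frac{1}{(1 + s^2)^2}\{(s^2 + s^4)\mathbf{i} - \sqrt2 s(1 + s^2)\mathbf{j} + (1 + s^2)\mathbf{k}\} = \frac{1}{1 + s^2}\{s^2\mathbf{i} - \sqrt2 s\,\mathbf{j} + \mathbf{k}\}$$

$$\frac{d\mathbf{B}}{ds} = \frac{2s\,\mathbf{i} + \sqrt2(s^2 - 1)\mathbf{j} - 2s\,\mathbf{k}}{(1 + s^2)^2} = -\frac{\sqrt2}{1 + s^2}\mathbf{N}$$

Therefore

$$\tau = \frac{\sqrt2}{1 + s^2}$$

At the point s = 1, i.e. x₁ = tan⁻¹ 1 = π/4, y₁ = (1/√2) log 2, z₁ = 1 − π/4,

τ = κ = 1/√2    T₁ = ½(i + √2j + k)
N₁ = ½(−√2i + √2k)    B₁ = ½(i − √2j + k)

The equation of the tangent at r₁ on the curve is given by

(r − r₁) x T₁ = 0

or by

(r − r₁) = tT₁

where t is a parameter. In either case,

$$\frac{x - \tfrac{1}{4}\pi}{1} = \frac{y - (1/\sqrt2)\log 2}{\sqrt2} = \frac{z - 1 + \tfrac{1}{4}\pi}{1}$$

The equation of the osculating plane at r₁ is given by

(r − r₁).B₁ = 0 or [(r − r₁), T₁, N₁] = 0

In either case,

$$(x - \tfrac{1}{4}\pi)\cdot 1 + \left(y - \frac{1}{\sqrt2}\log 2\right)(-\sqrt2) + (z - 1 + \tfrac{1}{4}\pi)\cdot 1 = 0$$

i.e. x − y√2 + z = 1 − log 2.
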
EXAMPLE 3.7 
The circular helix is a curve lying on the surface of a cylinder cutting its
generators at a constant angle. Taking the axis of the cylinder as the z-axis,
the parametric equations of the helix will be

x = a cos t    y = a sin t    z = at cot α

where a is the radius of the cylinder.

r = (a cos t)i + (a sin t)j + (at cot α)k

Now

$$\mathbf{T} = \frac{d\mathbf{r}}{ds} = \frac{d\mathbf{r}/dt}{ds/dt} = \frac{(-a\sin t)\mathbf{i} + (a\cos t)\mathbf{j} + (a\cot\alpha)\mathbf{k}}{ds/dt}$$

But

$$\frac{ds}{dt} = \left|\frac{d\mathbf{r}}{dt}\right| = \sqrt{a^2(\sin^2 t + \cos^2 t) + a^2\cot^2\alpha} = a\,\mathrm{cosec}\,\alpha$$

Therefore

T = {(−sin t)i + (cos t)j + (cot α)k} sin α

Now k is a unit vector parallel to the generators, and therefore the angle of
intersection φ of the curve with a generator through the point t is given by

cos φ = T.k = cos α

Therefore φ = α = constant. Thus the helix cuts the generators at a constant
angle α.

$$\frac{d\mathbf{T}}{ds} = \kappa\mathbf{N} = \{(-\cos t)\mathbf{i} - (\sin t)\mathbf{j}\}\sin\alpha\cdot\frac{\sin\alpha}{a}$$

Therefore

$$\kappa = \frac{\sin^2\alpha}{a} \quad\text{and}\quad \mathbf{N} = -\{(\cos t)\mathbf{i} + (\sin t)\mathbf{j}\}$$

Therefore N.k = 0, so that the principal normal is perpendicular to the axis
of the cylinder and intersects it.

$$\mathbf{B} = \mathbf{T} \times \mathbf{N} = -\sin\alpha\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ -\sin t & \cos t & \cot\alpha \\ \cos t & \sin t & 0 \end{vmatrix} = -\sin\alpha\{(-\cot\alpha\sin t)\mathbf{i} + (\cot\alpha\cos t)\mathbf{j} - \mathbf{k}\}$$

Now B.k = sin α, and therefore the binormals are inclined at a constant
angle (½π − α) to the generators.

$$\frac{d\mathbf{B}}{ds} = -\tau\mathbf{N} = \sin\alpha\{(\cot\alpha\cos t)\mathbf{i} + (\cot\alpha\sin t)\mathbf{j}\}\frac{\sin\alpha}{a} = -\frac{\sin\alpha\cos\alpha}{a}\mathbf{N}$$

Therefore

$$\tau = \frac{1}{a}\sin\alpha\cos\alpha$$


EXAMPLE 3.8 


Find the tangential and normal components of the acceleration of a particle 
moving on a curve. 


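Solution Let t be time, and s the arc distance. Then

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d\mathbf{r}}{ds}\frac{ds}{dt} \quad\text{and}\quad v = \frac{ds}{dt} = \left|\frac{d\mathbf{r}}{dt}\right|$$

Therefore v = vT and

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d}{dt}(v\mathbf{T}) = \frac{dv}{dt}\mathbf{T} + v\frac{d\mathbf{T}}{dt} = \frac{dv}{dt}\mathbf{T} + v\frac{d\mathbf{T}}{ds}\frac{ds}{dt} = \frac{dv}{dt}\mathbf{T} + \frac{v^2}{\rho}\mathbf{N}$$

Thus the tangential and normal components of a are dv/dt and v²/ρ
respectively, and a lies in the osculating plane.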


EXAMPLE 3.9 
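The equation of a curve in space is given by r = r(t). Show that

$$\kappa = \frac{|\dot{\mathbf{r}} \times \ddot{\mathbf{r}}|}{|\dot{\mathbf{r}}|^3} \quad\text{and}\quad \tau = \frac{[\dot{\mathbf{r}}, \ddot{\mathbf{r}}, \dddot{\mathbf{r}}]}{|\dot{\mathbf{r}} \times \ddot{\mathbf{r}}|^2}$$

Solution

$$\frac{d\mathbf{r}}{dt} = \frac{d\mathbf{r}}{ds}\frac{ds}{dt} \quad\text{i.e.}\quad \dot{\mathbf{r}} = \dot s\,\mathbf{T}$$

and $\dot s = |\dot{\mathbf{r}}|$ if s is measured in the direction of increasing t. Therefore

$$\ddot{\mathbf{r}} = \ddot s\,\mathbf{T} + \dot s\frac{d\mathbf{T}}{ds}\dot s = \ddot s\,\mathbf{T} + \dot s^2\kappa\mathbf{N}$$

Therefore $\dot{\mathbf{r}} \times \ddot{\mathbf{r}} = \kappa\dot s^3\mathbf{B}$, i.e. $\kappa = |\dot{\mathbf{r}} \times \ddot{\mathbf{r}}|/|\dot{\mathbf{r}}|^3$

Now

$$\dddot{\mathbf{r}} = \dddot s\,\mathbf{T} + \kappa\dot s\ddot s\,\mathbf{N} + \frac{d}{dt}(\kappa\dot s^2)\mathbf{N} + \kappa\dot s^3(\tau\mathbf{B} - \kappa\mathbf{T})$$

Therefore

$$[\dot{\mathbf{r}}, \ddot{\mathbf{r}}, \dddot{\mathbf{r}}] = (\dot{\mathbf{r}} \times \ddot{\mathbf{r}})\cdot\dddot{\mathbf{r}} = (\kappa\dot s^3)^2\tau\,\mathbf{B}\cdot\mathbf{B}$$

$$\tau = \frac{[\dot{\mathbf{r}}, \ddot{\mathbf{r}}, \dddot{\mathbf{r}}]}{|\dot{\mathbf{r}} \times \ddot{\mathbf{r}}|^2}$$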
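
The two formulae just obtained lend themselves to direct computation once the first three derivatives of r(t) are known. The Python sketch below applies them to a circular helix x = a cos t, y = a sin t, z = at cot α; the numerical values of a, α and t and the function name `curvature_torsion` are assumptions for illustration.

```python
# Curvature and torsion from the first three time derivatives of r(t).
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(u):
    return math.sqrt(dot(u, u))

def curvature_torsion(r1, r2, r3):
    """r1, r2, r3 are the first, second and third derivatives of r."""
    c = cross(r1, r2)
    kappa = norm(c) / norm(r1) ** 3
    tau = dot(c, r3) / dot(c, c)
    return kappa, tau

# Circular helix with assumed radius, angle and parameter value
a, alpha, t = 2.0, math.radians(30.0), 0.8
r1 = (-a * math.sin(t), a * math.cos(t), a / math.tan(alpha))
r2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
r3 = (a * math.sin(t), -a * math.cos(t), 0.0)

kappa, tau = curvature_torsion(r1, r2, r3)
print(kappa, math.sin(alpha) ** 2 / a)
print(tau, math.sin(alpha) * math.cos(alpha) / a)
```

For the helix both quantities are independent of t, so changing the assumed value of t should leave the printed values unchanged.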


EXERCISES 3.3 
1. Show that a curve, whose curvature is everywhere zero, is a straight line.

2. Show that for a plane curve, τ = 0 everywhere.


3. Show that (i) - x = «7T + °B 


aU at a d [rt 
wo fF at de - #5(2) 


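4. Verify that the Frenet-Serret formulae may be written in the form

$$\frac{d\mathbf{T}}{ds} = \mathbf{A} \times \mathbf{T} \qquad \frac{d\mathbf{N}}{ds} = \mathbf{A} \times \mathbf{N} \qquad \frac{d\mathbf{B}}{ds} = \mathbf{A} \times \mathbf{B}$$

where A = τT + κB.
5. Find the radius of curvature of the curve whose position vector is given by
r = (a cos u)i + (b sin u)j
a, b being constants. Comment on this result for a = b.

[(a² sin² u + b² cos² u)^{3/2}/ab]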


6. Show that for the plane curve x = x(t), y = y(t)

$$\kappa = \frac{\dot x\ddot y - \dot y\ddot x}{(\dot x^2 + \dot y^2)^{3/2}}$$

7. Find the vectors T, N and B for the curve defined by

$$\mathbf{r} = t\mathbf{i} + t^2\mathbf{j} + \tfrac{2}{3}t^3\mathbf{k}$$

and show that κ = τ = 2/(1 + 2t²)².

[T = (1 + 2t²)⁻¹(i + 2tj + 2t²k)
N = (1 + 2t²)⁻¹{−2ti + (1 − 2t²)j + 2tk}
B = (1 + 2t²)⁻¹(2t²i − 2tj + k)]

8. The parametric equations of the twisted cubic are x = t, y = t², z = t³. Show
that

$$\kappa = \frac{2(9t^4 + 9t^2 + 1)^{1/2}}{(9t^4 + 4t^2 + 1)^{3/2}} \qquad \tau = \frac{3}{9t^4 + 9t^2 + 1}$$

9. Using the results of Example 3.9 show for the curve defined by
r = (3t − t³)i + 3t²j + (3t + t³)k
that κ = τ = 1/3(1 + t²)².
10. For Exercise 9 show that
T = 2^{−1/2}(1 + t²)⁻¹{(1 − t²)i + 2tj + (1 + t²)k}
N = −(1 + t²)⁻¹{2ti + (t² − 1)j}
B = 2^{−1/2}(1 + t²)⁻¹{(t² − 1)i − 2tj + (1 + t²)k}

Obtain the equations of the osculating, normal and rectifying planes at the
point t = 1. [y − z + 1 = 0, y + z − 7 = 0, x = 2]

11. Show that for the curve x = 16 cos³ t, y = 16 sin³ t, z = 9 cos 2t,
T = ⅕{(−4 cos t)i + (4 sin t)j − 3k}
N = (sin t)i + (cos t)j
B = ⅕{(3 cos t)i − (3 sin t)j − 4k}

Find the equations of the tangent, normal and binormal and of the osculating,
normal and rectifying planes at the point t = π/3.

$$\left[\text{tangent: } \frac{x - 2}{-2} = \frac{y - 6\sqrt3}{2\sqrt3} = \frac{z + 9/2}{-3}; \quad \text{normal: } \frac{x - 2}{\sqrt3} = \frac{y - 6\sqrt3}{1} = \frac{z + 9/2}{0}; \quad \text{binormal: } \frac{x - 2}{3} = \frac{y - 6\sqrt3}{-3\sqrt3} = \frac{z + 9/2}{-8}\right.$$

Osculating plane: 3x − 3√3y − 8z + 12 = 0; normal: 4x − 4√3y + 6z +
91 = 0; rectifying: √3x + y − 8√3 = 0]


12. Show that the acceleration vector a of a particle moving on a curve always lies
in the osculating plane.


13. A particle is moving on the curve


r = 3f5i + (¢ — 2t2)j — 21°k 


where t is the time. Find the magnitudes of the tangential and normal
components of its acceleration at time t = 1. [41√(2/7); √(26/7)]

14. A particle, moving on a curve, has instantaneously a velocity v and an
acceleration a. Show that the curvature of its path at this instant is given by

$$\kappa = \frac{1}{v^3}|\mathbf{v} \times \mathbf{a}|$$


4 Vector integration 


4.1 Integral of a Vector Function of a Scalar 


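If F(u) is a vector function of a scalar u such that

$$\frac{d\mathbf{F}}{du} = \mathbf{R}(u)$$

then we say that F(u) is the integral of R(u) with respect to u, i.e.

$$\mathbf{F}(u) = \int \mathbf{R}(u)\,du$$

as in the case of algebraic functions.
If c is an arbitrary constant vector,

$$\frac{d}{du}(\mathbf{F} + \mathbf{c}) = \frac{d\mathbf{F}}{du} = \mathbf{R}(u) \quad\text{so that}\quad \int \mathbf{R}(u)\,du = \mathbf{F} + \mathbf{c}$$

F(u) is called the indefinite integral of R(u) since it is indefinite to the extent
of the arbitrary constant of integration c. A definite integral, between
specified limits u = a and u = b, is denoted by

$$\int_a^b \mathbf{R}(u)\,du$$

and may be evaluated in the same way as definite integrals of algebraic
functions.
From section 3.4, it follows that

$$\int\left(\frac{d\mathbf{A}}{du}\cdot\mathbf{B} + \mathbf{A}\cdot\frac{d\mathbf{B}}{du}\right)du = \mathbf{A}\cdot\mathbf{B} + c \qquad (4.1)$$

and

$$\int\left(\frac{d\mathbf{A}}{du} \times \mathbf{B} + \mathbf{A} \times \frac{d\mathbf{B}}{du}\right)du = \mathbf{A} \times \mathbf{B} + \mathbf{c} \qquad (4.2)$$

From (4.1) we have

$$\int \mathbf{A}\cdot\frac{d\mathbf{A}}{du}\,du = \tfrac{1}{2}\mathbf{A}\cdot\mathbf{A} + c = \tfrac{1}{2}A^2 + c \qquad (4.3)$$

and hence

$$2\int \mathbf{r}\cdot\dot{\mathbf{r}}\,dt = (\mathbf{r})^2 + c \qquad 2\int \dot{\mathbf{r}}\cdot\ddot{\mathbf{r}}\,dt = (\dot{\mathbf{r}})^2 + c \qquad (4.4)$$

From (4.2) we have

$$\int(\dot{\mathbf{r}} \times \dot{\mathbf{r}} + \mathbf{r} \times \ddot{\mathbf{r}})\,dt = \mathbf{r} \times \dot{\mathbf{r}} + \mathbf{c} \quad\text{i.e.}\quad \int \mathbf{r} \times \ddot{\mathbf{r}}\,dt = \mathbf{r} \times \dot{\mathbf{r}} + \mathbf{c} \qquad (4.5)$$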

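EXAMPLE 4.1

If R(u) = u³i + (1 − u)j − 3u²k, find $\int \mathbf{R}(u)\,du$ and $\int_1^2 \mathbf{R}(u)\,du$.

$$\int \mathbf{R}(u)\,du = \mathbf{i}\int u^3\,du + \mathbf{j}\int(1 - u)\,du - \mathbf{k}\int 3u^2\,du = \tfrac{1}{4}u^4\mathbf{i} + (u - \tfrac{1}{2}u^2)\mathbf{j} - u^3\mathbf{k} + \mathbf{c}$$

$$\int_1^2 \mathbf{R}(u)\,du = \left[\tfrac{1}{4}u^4\mathbf{i} + (u - \tfrac{1}{2}u^2)\mathbf{j} - u^3\mathbf{k}\right]_1^2 = \tfrac{1}{4}(16 - 1)\mathbf{i} + (2 - 1 - 2 + \tfrac{1}{2})\mathbf{j} - (8 - 1)\mathbf{k} = \tfrac{1}{4}\{15\mathbf{i} - 2\mathbf{j} - 28\mathbf{k}\}$$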
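
A definite integral of a vector function may also be evaluated numerically, one component at a time. The sketch below uses Simpson's rule in Python, which is a different technique from the term-by-term integration of the example and is offered only as a check; the integrand R(u) = u³i + (1 − u)j − 3u²k with limits 1 and 2 and the function name `simpson_vector` are assumptions of this illustration.

```python
# Component-wise Simpson's rule for the definite integral of a vector function.

def R(u):
    # Illustrative integrand: u^3 i + (1 - u) j - 3u^2 k
    return (u ** 3, 1 - u, -3 * u * u)

def simpson_vector(f, lo, hi, n=10):
    """Integrate each component of f from lo to hi; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (hi - lo) / n
    total = [0.0, 0.0, 0.0]
    for k in range(n + 1):
        # Simpson weights 1, 4, 2, 4, ..., 2, 4, 1
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        fk = f(lo + k * h)
        for i in range(3):
            total[i] += weight * fk[i]
    return tuple(h / 3 * s for s in total)

result = simpson_vector(R, 1.0, 2.0)
print(result)
```

Since every component here is a polynomial of degree three or less, Simpson's rule reproduces the integral up to rounding error, so the output may be compared directly with ¼(15i − 2j − 28k).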
EXAMPLE 4.2 
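The acceleration a of a particle at time t is given by

a = (cos t)i + e⁻ᵗj − 6tk

If the velocity v and the displacement r at time t = 0 are −j and −i
respectively, find v and r at time t.

Solution

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = (\cos t)\mathbf{i} + e^{-t}\mathbf{j} - 6t\mathbf{k}$$

Therefore

$$\mathbf{v} = \int\{(\cos t)\mathbf{i} + e^{-t}\mathbf{j} - 6t\mathbf{k}\}\,dt = (\sin t)\mathbf{i} - e^{-t}\mathbf{j} - 3t^2\mathbf{k} + \mathbf{c}$$

When t = 0, v = −j. Therefore

−j = −j + c    c = 0

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = (\sin t)\mathbf{i} - e^{-t}\mathbf{j} - 3t^2\mathbf{k}$$

$$\mathbf{r} = -(\cos t)\mathbf{i} + e^{-t}\mathbf{j} - t^3\mathbf{k} + \mathbf{d}$$

When t = 0, r = −i. Therefore

−i = −i + j + d    d = −j

$$\mathbf{r} = -(\cos t)\mathbf{i} + (e^{-t} - 1)\mathbf{j} - t^3\mathbf{k}$$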




EXAMPLE 4.3 
The position vector r of a moving particle is given by the following equation 
of motion 
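$$\ddot{\mathbf{r}} + n^2\mathbf{r} = \mathbf{0} \qquad \text{(see Example 3.2(C), p. 70)}$$

To integrate this equation, form the scalar product with $2\dot{\mathbf{r}}$, and we have

$$2\dot{\mathbf{r}}\cdot\ddot{\mathbf{r}} + 2n^2\,\mathbf{r}\cdot\dot{\mathbf{r}} = 0$$

On integrating with respect to t, we have

$$(\dot{\mathbf{r}})^2 + n^2(r)^2 = c \quad \text{or} \quad (\dot{\mathbf{r}})^2 = v^2 = c - \mu(r)^2 \quad \text{where } \mu = n^2 \qquad (4.6)$$

Again on taking the vector product of the equation of motion with r, we have

$$\mathbf{r}\times\ddot{\mathbf{r}} + n^2\,\mathbf{r}\times\mathbf{r} = \mathbf{0} \quad \text{i.e.} \quad \mathbf{r}\times\ddot{\mathbf{r}} = \mathbf{0}$$

On integrating with respect to t, we have

$$\mathbf{r}\times\dot{\mathbf{r}} = \mathbf{h} \quad \text{a constant vector}$$

i.e.

$$\mathbf{r}\times\mathbf{v} = \mathbf{h} = pv\,\mathbf{k} \qquad (4.7)$$

where p is the perpendicular from O to the tangent to the path at R (Fig. 4.1) and k is a unit vector having the direction of the vector r × v. Therefore

$$pv = h \qquad (4.8)$$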


Figure 4.1 


Since h is a constant vector perpendicular to the plane of r and v, the particle moves in a plane curve. (See section 4.9.)


EXERCISES 4.1 
1. If A@u) = wi + (u — 12j — uk and B = —ui + (u + 1)j — u*k, find fj A.B du 
and fA xBdu. [6,341 + 925) + 74kI 


2. If A(t) = 2ti — 9) + (1 — #*)k, find JA. (@A/dt) dt [42] 
4.2 Line Integrals 
Let C be a curve defined by the position vector

$$\mathbf{r}(u) = x(u)\mathbf{i} + y(u)\mathbf{j} + z(u)\mathbf{k}$$

and let $u = u_1$, $u = u_2$ determine two points P, Q respectively on C. Let $\mathbf{A}(x, y, z) = A_1\mathbf{i} + A_2\mathbf{j} + A_3\mathbf{k}$ be a vector function of position which is defined and continuous at all points on C. See Fig. 4.2.

Figure 4.2

Now

$$\mathbf{A}\cdot d\mathbf{r} = A\cos\theta\,dr = dr \times (\text{the tangential component of } \mathbf{A})$$


The line integral of A along the curve between P and Q is defined to be the integral of the tangential component of A between these points, i.e.

$$\int_P^Q \mathbf{A}\cdot d\mathbf{r} = \int_P^Q (A_1\,dx + A_2\,dy + A_3\,dz)$$


If a particle is moving along C under the action of a force F = Xi + Yj + Zk then the line integral

$$\int_P^Q \mathbf{F}\cdot d\mathbf{r} = \int_P^Q (X\,dx + Y\,dy + Z\,dz)$$

represents the work done by F in moving the particle from P to Q. If C is a simple closed curve, i.e. a closed curve which does not intersect itself, the line integral of A around C is denoted by

$$\oint_C \mathbf{A}\cdot d\mathbf{r} = \oint_C (A_1\,dx + A_2\,dy + A_3\,dz)$$

If the velocity V of a fluid is given by V = ui + vj + wk, then in aero- and hydrodynamics

$$\oint_C \mathbf{V}\cdot d\mathbf{r} = \oint_C (u\,dx + v\,dy + w\,dz)$$

is termed the circulation about the curve C.


EXAMPLE 4.4 

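A particle moves in a force field $\mathbf{F} = x^2\mathbf{i} + (y^2 - z^2)\mathbf{j} + z^3\mathbf{k}$. Find the work done by F in moving the particle along

(a) the straight line joining the points (0, 0, 0) and (1, 2, 1)

(b) the curve $x = t^3$, $y = 2t^2$, $z = t + 2$ from t = 0 to t = 1

(c) the curve of intersection of the surfaces $2y^2 = x$ and $y^2 = z$ from y = 0 to y = 1, i.e. from the origin to the point (2, 1, 1).

Solution
(a) The equation of the straight line is

$$\frac{x}{1} = \frac{y}{2} = \frac{z}{1} = t \quad \text{i.e.} \quad x = z = t, \; y = 2t$$

$$\text{Work done} = \int \mathbf{F}\cdot d\mathbf{r} = \int \{x^2\,dx + (y^2 - z^2)\,dy + z^3\,dz\}$$

$$= \int_0^1 \{t^2\,dt + (4t^2 - t^2)\,d(2t) + t^3\,dt\}$$

$$= \int_0^1 (t^3 + 7t^2)\,dt = \tfrac{1}{4} + \tfrac{7}{3} = 2\tfrac{7}{12}$$

(b)

$$\text{Work done} = \int_0^1 \left[t^6\,3t^2\,dt + \{4t^4 - (t + 2)^2\}\,4t\,dt + (t + 2)^3\,dt\right]$$

$$= \int_0^1 (3t^8 + 16t^5 - 3t^3 - 10t^2 - 4t + 8)\,dt$$

$$= \left[\frac{t^9}{3} + \frac{8t^6}{3} - \frac{3t^4}{4} - \frac{10t^3}{3} - 2t^2 + 8t\right]_0^1 = 4\tfrac{11}{12}$$

(c)

$$\text{Work done} = \int \{x^2\,dx + (y^2 - z^2)\,dy + z^3\,dz\} \quad \text{where } x = 2y^2, \; z = y^2$$

$$= \int_0^1 \{4y^4\,4y\,dy + (y^2 - y^4)\,dy + y^6\,2y\,dy\}$$

$$= \int_0^1 \{16y^5 + y^2 - y^4 + 2y^7\}\,dy$$

$$= \left[\frac{8y^6}{3} + \frac{y^3}{3} - \frac{y^5}{5} + \frac{y^8}{4}\right]_0^1 = 3\tfrac{1}{20}$$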
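The three work integrals of Example 4.4 can be checked numerically by parametrising each path and integrating F · (dr/dt). The sketch below assumes a hand-written composite Simpson rule; the function names `work`, `force` and the path lambdas are illustrative only.

```python
def force(x, y, z):
    """Force field of Example 4.4: F = x^2 i + (y^2 - z^2) j + z^3 k."""
    return (x**2, y**2 - z**2, z**3)

def work(path, dpath, t0, t1, n=2000):
    """Composite Simpson estimate of the line integral of F along a path.

    path(t) returns (x, y, z); dpath(t) returns (dx/dt, dy/dt, dz/dt).
    n must be even.
    """
    def integrand(t):
        f = force(*path(t))
        d = dpath(t)
        return sum(fc * dc for fc, dc in zip(f, d))

    h = (t1 - t0) / n
    total = integrand(t0) + integrand(t1)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(t0 + k * h)
    return total * h / 3

# (a) straight line x = z = t, y = 2t
work_a = work(lambda t: (t, 2 * t, t), lambda t: (1, 2, 1), 0.0, 1.0)
# (b) curve x = t^3, y = 2t^2, z = t + 2
work_b = work(lambda t: (t**3, 2 * t**2, t + 2),
              lambda t: (3 * t**2, 4 * t, 1), 0.0, 1.0)
# (c) curve x = 2y^2, z = y^2, with y itself as the parameter
work_c = work(lambda y: (2 * y**2, y, y**2),
              lambda y: (4 * y, 1, 2 * y), 0.0, 1.0)
print(work_a, work_b, work_c)
```

The three values differ because the paths (and, for (b), the end points) differ; the field is not being assumed conservative here.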


EXAMPLE 4.5 


Calculate the circulation of a vector V = (x − 2y)i + (2x + y)j in an anticlockwise sense around the closed curve bounded by the curves $y = x^2$, $8x = y^2$ in the plane z = 0.




Figure 4.3 


Solution These two boundary curves meet at the point (2, 4) (Fig. 4.3). 


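$$\text{Circulation} = \oint \mathbf{V}\cdot d\mathbf{r} = \oint \{(x - 2y)\,dx + (2x + y)\,dy\}$$

$$= \int_0^2 \{(x - 2x^2)\,dx + (2x + x^2)\,2x\,dx\} + \int_4^0 \left\{\left(\frac{y^2}{8} - 2y\right)\frac{y}{4}\,dy + \left(\frac{y^2}{4} + y\right)dy\right\}$$

since $y = x^2$ along OAP and x goes from 0 to 2 whilst $x = y^2/8$ along PBO and y goes from 4 to 0. Therefore

$$\text{Circulation} = \int_0^2 (x + 2x^2 + 2x^3)\,dx - \int_0^4 \left(\frac{y^3}{32} - \frac{y^2}{4} + y\right)dy$$

$$= \left[\frac{x^2}{2} + \frac{2x^3}{3} + \frac{x^4}{2}\right]_0^2 - \left[\frac{y^4}{128} - \frac{y^3}{12} + \frac{y^2}{2}\right]_0^4 = \frac{46}{3} - \frac{14}{3} = 10\tfrac{2}{3}$$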
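As a rough check on Example 4.5, the two branches of the closed curve can be integrated separately and added. The sketch below is an assumption-laden illustration: `simpson`, `lower_branch` and `upper_branch` are hypothetical helper names, and the upper branch is traversed from y = 4 to y = 0 to respect the anticlockwise sense.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals.

    Works for b < a as well, in which case the step h is negative.
    """
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def V(x, y):
    """Vector field of Example 4.5 in the plane z = 0."""
    return (x - 2 * y, 2 * x + y)

def lower_branch(x):
    # Along y = x^2: dy = 2x dx, x running from 0 to 2.
    vx, vy = V(x, x * x)
    return vx + vy * 2 * x

def upper_branch(y):
    # Along x = y^2 / 8: dx = (y / 4) dy, y running from 4 back to 0.
    vx, vy = V(y * y / 8, y)
    return vx * y / 4 + vy

circulation = simpson(lower_branch, 0.0, 2.0) + simpson(upper_branch, 4.0, 0.0)
print(circulation)
```

Both integrands are cubic polynomials, for which Simpson's rule is exact apart from rounding.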




EXERCISES 4.2 
1. If A = (2x − y)i + (x + z)j + (y + z)k, evaluate $\int \mathbf{A}\cdot d\mathbf{r}$ along the following curves:
(a) straight lines from the origin to (0, 0, 2), thence to (0, 1, 2), and finally to (2, 1, 2)
(b) $x = 2t$, $y = t^3$, $z = 2t^2$ from the origin to the point (2, 1, 2) [(a) 6, (b) 9]
2. Find the circulation of the vector 


A= (2 — pit G2 + yi 


around the triangle whose vertices are (0, 0), (0, 3), (1, 3) in the counterclockwise 
sense (Fig. 4.4). [7] 


Figure 4.4 


3. Show that the line integral of the vector 
A = (2xz + y®)i + 2xyj + x°zk 


along a curve between two given points is independent of the curve. 
(Hint: show that (2xz + y”) dx + 2xy dy + x®z dz is an exact differential] 

4. Show that the circulation of the vector (xy − z)i + (yz − x)j + (zx − y)k in the clockwise sense around the circle $x^2 + y^2 = 4$, z = 2 is 4π.


4.3 Surface Integrals 


Let S be a surface having two sides. Either side of the surface may be 
arbitrarily chosen as the positive side. Let dS be an element of area on the 
surface and let n be the unit normal to the surface at dS drawn outwardly 
from the positive side. 

The vector area dS is taken to be a vector of magnitude dS having the 
direction of n, i.e. dS = dSn. 




Integrals of the type

$$\iint_S \mathbf{A}\cdot d\mathbf{S} = \iint_S \mathbf{A}\cdot\mathbf{n}\,dS$$

evaluated over the surface S arise frequently in applied mathematics. $\iint_S \mathbf{A}\cdot d\mathbf{S}$ is called the flux of A over S, and is a scalar quantity.

$$\iint_S \mathbf{A}\times d\mathbf{S} = \iint_S \mathbf{A}\times\mathbf{n}\,dS$$

is another example of a surface integral, in this case a vector quantity. Surface integrals may be evaluated in terms of double integrals taken over the area obtained by projecting S on to one of the coordinate planes. In Fig. 4.5, $S_{xy}$ is the projection of S on the x,y plane and the surface integral may be found by integrating over $S_{xy}$ with respect to x, y or other appropriate coordinates.

Figure 4.5

This procedure is possible provided that perpendiculars to the x,y plane 
meet S in one point only. If this is not so, S may be divided into sub-areas 
each of which satisfies this requirement. The surface integral is then found 
as the sum of the integrals over the sub-areas. 




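Consider the evaluation of $\iint_S \mathbf{A}\cdot d\mathbf{S} = \iint_S \mathbf{A}\cdot\mathbf{n}\,dS$. Let dx dy be the projection of dS on the x,y plane (Fig. 4.5). Since n is a unit normal to dS and k is a unit normal to dx dy,

$$dx\,dy = dS\cos\theta = (\mathbf{n}\cdot\mathbf{k})\,dS \quad \text{i.e.} \quad dS = \frac{dx\,dy}{(\mathbf{n}\cdot\mathbf{k})}$$

Therefore

$$\iint_S \mathbf{A}\cdot d\mathbf{S} = \iint_{S_{xy}} \mathbf{A}\cdot\mathbf{n}\,\frac{dx\,dy}{(\mathbf{n}\cdot\mathbf{k})} \qquad (4.9)$$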


4.4 Normal to a Surface 
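The vector differential operator del, denoted by ∇, is defined by

$$\nabla = \frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k} \qquad (4.10)$$

Let φ(x, y, z) be a defined and differentiable function in a given region. The gradient of φ is denoted by grad φ or ∇φ and is defined by

$$\nabla\phi = \left(\frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k}\right)\phi = \frac{\partial\phi}{\partial x}\mathbf{i} + \frac{\partial\phi}{\partial y}\mathbf{j} + \frac{\partial\phi}{\partial z}\mathbf{k} \qquad (4.11)$$

Let r = xi + yj + zk be the position vector of a point (x, y, z) on the surface φ = constant, then dr = dx i + dy j + dz k lies in the tangent plane to the surface at (x, y, z).

Now dφ = φ(x + dx, y + dy, z + dz) − φ(x, y, z) is of the second order of small quantities. Therefore

$$\frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy + \frac{\partial\phi}{\partial z}\,dz = 0$$

$$\text{i.e.} \quad \left(\frac{\partial\phi}{\partial x}\mathbf{i} + \frac{\partial\phi}{\partial y}\mathbf{j} + \frac{\partial\phi}{\partial z}\mathbf{k}\right)\cdot(dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}) = 0$$

so that ∇φ · dr = 0. Thus ∇φ is a vector perpendicular to dr and therefore to the surface. Therefore ∇φ is a vector having the direction of the unit normal n to φ(x, y, z) = constant at (x, y, z).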




EXAMPLE 4.6 
Evaluate the surface integral $\iint_S \mathbf{A}\cdot d\mathbf{S}$ of the vector A = yi + 2zj − xk over that part of the plane 2x + y + 3z = 6 which lies in the first octant.

Figure 4.6

Solution The surface integral is to be evaluated over the triangular area ABC (Fig. 4.6). By projecting on to the x,y plane, the surface integral may be evaluated as

$$\iint_{\triangle OAB} \mathbf{A}\cdot\mathbf{n}\,\frac{dx\,dy}{(\mathbf{n}\cdot\mathbf{k})}$$

where n is a unit vector normal to plane ABC. Therefore

$$\mathbf{n} = \frac{2\mathbf{i} + \mathbf{j} + 3\mathbf{k}}{\sqrt{14}} \qquad \mathbf{n}\cdot\mathbf{k} = \frac{3}{\sqrt{14}}$$

$$\mathbf{A}\cdot\mathbf{n} = \frac{1}{\sqrt{14}}(2y + 2z - 3x)$$

Therefore

$$\text{Surface integral} = \iint_{\triangle OAB} \frac{1}{\sqrt{14}}(2y + 2z - 3x)\,\frac{dx\,dy}{3/\sqrt{14}} = \frac{1}{3}\iint_{\triangle OAB} (2y + 2z - 3x)\,dx\,dy$$

But on the plane, 2z = 4 − 2y/3 − 4x/3, and therefore

$$\text{Surface integral} = \frac{1}{3}\iint_{\triangle OAB} \left(\frac{4y}{3} - \frac{13x}{3} + 4\right)dx\,dy$$

It remains to determine the limits of integration for x and y. The equations of the line AB are 2x + y = 6, z = 0. Keep x constant and integrate along PQ, i.e. from y = 0 to y = 6 − 2x. The whole of the area OAB will now be covered if we integrate from x = 0 to x = 3.

$$\text{Surface integral} = \frac{1}{9}\int_{x=0}^{3}\int_{y=0}^{6-2x} (4y - 13x + 12)\,dy\,dx$$

$$= \frac{1}{9}\int_{x=0}^{3} \left[2y^2 - 13xy + 12y\right]_0^{6-2x} dx$$

$$= \frac{1}{9}\int_0^3 (144 - 150x + 34x^2)\,dx = 7$$
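The projection method of eqn (4.9), as applied in Example 4.6, can be checked by a nested numerical integration over the triangle OAB. This is a minimal sketch; `simpson` and `flux_through_plane` are illustrative names, and the integrand is A · n / (n · k) with z eliminated using the equation of the plane.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def flux_through_plane():
    """Flux of A = y i + 2z j - x k through 2x + y + 3z = 6, first octant."""
    def inner(x):
        def projected_integrand(y):
            z = (6 - 2 * x - y) / 3           # point on the plane above (x, y)
            # A.n / (n.k) with n = (2, 1, 3)/sqrt(14); the sqrt(14) cancels.
            return (2 * y + 2 * z - 3 * x) / 3
        return simpson(projected_integrand, 0.0, 6 - 2 * x)
    return simpson(inner, 0.0, 3.0)

flux = flux_through_plane()
print(flux)
```

The inner limits y = 0 to y = 6 − 2x and the outer limits x = 0 to x = 3 are those chosen in the worked solution.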


EXAMPLE 4.7 


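Evaluate $\iint_S \mathbf{A}\cdot d\mathbf{S}$ where A = xi − yj + 2zk over that part of the surface of the parabolic cylinder $4y = x^2$ cut off by the planes y = 4 and z = 5.

Solution $\phi = x^2 - 4y = 0$

$$\nabla\phi = \left(\frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k}\right)\phi = 2x\mathbf{i} - 4\mathbf{j} + 0\mathbf{k}$$

$$\mathbf{n} = \frac{x\mathbf{i} - 2\mathbf{j}}{\sqrt{(x^2 + 4)}} \qquad \mathbf{A}\cdot\mathbf{n} = \frac{x^2 + 2y}{\sqrt{(x^2 + 4)}}$$

Figure 4.7

In this example it is clearly not possible to evaluate the surface integral by projection on the x,y plane. However

$$\iint_S \mathbf{A}\cdot\mathbf{n}\,dS = \iint_{S_{yz}} \mathbf{A}\cdot\mathbf{n}\,\frac{dy\,dz}{(\mathbf{n}\cdot\mathbf{i})}$$

by projecting on the y,z plane and $\mathbf{n}\cdot\mathbf{i} = x/\sqrt{(x^2 + 4)}$. Therefore

$$\text{Surface integral} = \iint_{S_{yz}} \frac{x^2 + 2y}{\sqrt{(x^2 + 4)}}\,\frac{dy\,dz}{x/\sqrt{(x^2 + 4)}} = \iint_{S_{yz}} \frac{x^2 + 2y}{x}\,dy\,dz$$

But on the surface $x^2 = 4y$ and therefore

$$\text{Surface integral} = \int_{z=0}^{5}\int_{y=0}^{4} \frac{4y + 2y}{2y^{1/2}}\,dy\,dz$$

$$= \int_{z=0}^{5}\int_{y=0}^{4} 3y^{1/2}\,dy\,dz$$

$$= \int_0^5 \left[2y^{3/2}\right]_0^4 dz$$

$$= \int_0^5 16\,dz = 80$$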




EXERCISES 4.3 
1. Evaluate $\iint_S \mathbf{A}\cdot d\mathbf{S}$ when $\mathbf{A} = (x^2 + y)\mathbf{i} - 2y\mathbf{j} + xz\mathbf{k}$ and S is that part of the plane x + 2y + 2z = 10 which is in the first octant. [250]

2. If r = xi + yj + zk, show that $\iint \mathbf{r}\cdot d\mathbf{S}$ evaluated over all the faces of a unit cube bounded by the planes x = 0, x = 1, y = 0, y = 1, z = 0, z = 1 is 3.

3. If A = 2yzi + xzj + zk, evaluate $\iint \mathbf{A}\cdot d\mathbf{S}$ over the closed surface bounded by the right cone $x^2 + y^2 = 4z^2$ and the plane z = 2. [160π/3]

4. The equation of a surface S is φ(x, y, z) = 0. Show that its area is given by

$$\iint_{S_{xy}} \frac{\left\{\left(\dfrac{\partial\phi}{\partial x}\right)^2 + \left(\dfrac{\partial\phi}{\partial y}\right)^2 + \left(\dfrac{\partial\phi}{\partial z}\right)^2\right\}^{1/2}}{\left|\dfrac{\partial\phi}{\partial z}\right|}\,dx\,dy$$

Use this result to find the area of the surface of the sphere $x^2 + y^2 + z^2 = a^2$.

5. Find the area of the surface of the plane 3x + y + 2z = 6 cut off by (a) the planes x = 0, y = 0, x = 1, y = 2 and (b) the planes x = 0, y = 0 and the cylinder $4x^2 + y^2 = 4$. [(a) √14, (b) π√14/4]

6. Find the surface area of the region common to the two cylinders $x^2 + y^2 = 4$ and $x^2 + z^2 = 4$. [64]

7. A hole of square cross-section, side 2b, is cut symmetrically through a right circular cylinder of radius a > b, the axis of the hole being perpendicular to, and two faces of the hole being parallel to, the axis of the cylinder.

Show that the area removed from the curved surface of the cylinder is $8ab\,\sin^{-1}(b/a)$.

8. Show that the area of that portion of the surface of the sphere $x^2 + y^2 + z^2 = a^2$ within the cylinder $x^2 + y^2 = ax$ is $2a^2(\pi - 2)$.

[Hint: use cylindrical coordinates]


4.5 Volume Integrals 


Let V be the volume within a closed surface. Integrals of the types 


$$\iiint_V \mathbf{A}\,dV \quad \text{and} \quad \iiint_V \phi\,dV$$


which are integrated over the volume V, with A and ¢ being respectively 
vector and scalar functions of position, are called volume integrals. 


EXAMPLE 4.8 
A tetrahedron is bounded by the planes z = 0, z = x, y = 2a, y = 2x. If the tetrahedron is composed of material of density $\rho(x, y, z) = xy^2z^3$, calculate its mass.

Solution $\text{Mass} = \iiint_{OABC} \rho\,dx\,dy\,dz = \iiint_{OABC} xy^2z^3\,dx\,dy\,dz$.

△ABC is right-angled at A (Fig. 4.8). Let αβγ be a slice of the tetrahedron of width dy parallel to △ABC.

Consider an element of mass ρ dx dy dz at the point P(x, y, z). Keeping x and y constant, integrate with respect to z so that z goes from 0 to x (the equation of βγ is y = constant, z = x). Then integrate with respect to x so that x goes from 0 to y/2, and finally integrate with respect to y which goes from 0 to 2a.

Figure 4.8

Therefore

$$\text{Mass} = \int_{y=0}^{2a}\int_{x=0}^{y/2}\int_{z=0}^{x} xy^2z^3\,dz\,dx\,dy = \int_{y=0}^{2a}\int_{x=0}^{y/2} xy^2\left[\frac{z^4}{4}\right]_0^x dx\,dy$$

$$= \frac{1}{4}\int_{y=0}^{2a} y^2\left[\frac{x^6}{6}\right]_0^{y/2} dy = \frac{1}{24 \times 64}\int_0^{2a} y^8\,dy$$

$$= \frac{(2a)^9}{24 \times 64 \times 9} = \frac{a^9}{27}$$




EXAMPLE 4.9 
Find the volume common to the sphere $x^2 + y^2 + z^2 = a^2$ and the cylinder $x^2 + y^2 = ax$.

Figure 4.9

Solution The axis of the cylinder is parallel to Oz and the cylinder intersects the plane z = 0 in the circle $(x - \tfrac{1}{2}a)^2 + y^2 = (\tfrac{1}{2}a)^2$, i.e. a circle centre $(\tfrac{1}{2}a, 0)$, radius $\tfrac{1}{2}a$. Using cylindrical coordinates, the element of volume at the point (r, θ, z) is r dθ dr dz.

The volume common to the sphere and the cylinder is $V = \iiint r\,d\theta\,dr\,dz$. Keeping r, θ constant, integrate with respect to z from $-\sqrt{(a^2 - r^2)}$ to $+\sqrt{(a^2 - r^2)}$. Now keeping θ constant, integrate with respect to r from 0 to a cos θ and finally integrate with respect to θ from $-\tfrac{1}{2}\pi$ to $+\tfrac{1}{2}\pi$. Thus

$$V = \int_{\theta=-\pi/2}^{+\pi/2}\int_{r=0}^{a\cos\theta}\int_{z=-\sqrt{(a^2-r^2)}}^{+\sqrt{(a^2-r^2)}} r\,dz\,dr\,d\theta$$

$$= 4\int_{\theta=0}^{\pi/2}\int_{r=0}^{a\cos\theta} r\sqrt{(a^2 - r^2)}\,dr\,d\theta$$

$$= 4\int_0^{\pi/2} \left[-\tfrac{1}{3}(a^2 - r^2)^{3/2}\right]_0^{a\cos\theta} d\theta$$

$$= \frac{4a^3}{3}\int_0^{\pi/2} (1 - \sin^3\theta)\,d\theta = \frac{2a^3}{9}(3\pi - 4)$$




EXAMPLE 4.10 
Evaluate $\iiint_V \mathbf{A}\,dV$ where A = xi + j − 2yk and where V is the region bounded by the planes x = 0, y = 0, z = 0 and 2x + 2y + z = 4.

Solution

$$\iiint_V \mathbf{A}\,dV = \mathbf{i}\iiint x\,dx\,dy\,dz + \mathbf{j}\iiint dx\,dy\,dz - 2\mathbf{k}\iiint y\,dx\,dy\,dz$$

Consider the volume element dx dy dz at the point (x, y, z) (Fig. 4.10). Integrate with respect to z from z = 0 to z = 2(2 − x − y). Then integrate with respect to y from y = 0 to y = 2 − x, and finally integrate with respect to x from x = 0 to x = 2.

Figure 4.10

$$\iiint dx\,dy\,dz = \int_{x=0}^{2}\int_{y=0}^{2-x}\int_{z=0}^{2(2-x-y)} dz\,dy\,dx$$

$$= 2\int_{x=0}^{2}\int_{y=0}^{2-x} (2 - x - y)\,dy\,dx$$

$$= 2\int_{x=0}^{2} \left[2y - xy - \tfrac{1}{2}y^2\right]_0^{2-x} dx$$

$$= 2\int_0^2 (2 - 2x + \tfrac{1}{2}x^2)\,dx = \frac{8}{3}$$

$$\iiint x\,dx\,dy\,dz = \int_{x=0}^{2}\int_{y=0}^{2-x}\int_{z=0}^{2(2-x-y)} x\,dz\,dy\,dx$$

$$= 2\int_{x=0}^{2}\int_{y=0}^{2-x} x(2 - x - y)\,dy\,dx$$

$$= 2\int_{x=0}^{2} x\left[2y - xy - \tfrac{1}{2}y^2\right]_0^{2-x} dx$$

$$= 2\int_0^2 (2x - 2x^2 + \tfrac{1}{2}x^3)\,dx = \frac{4}{3}$$

$$\iiint y\,dx\,dy\,dz = \int_{x=0}^{2}\int_{y=0}^{2-x}\int_{z=0}^{2(2-x-y)} y\,dz\,dy\,dx$$

$$= 2\int_{x=0}^{2}\int_{y=0}^{2-x} y(2 - x - y)\,dy\,dx$$

$$= 2\int_{x=0}^{2} \left[y^2 - \tfrac{1}{2}xy^2 - \tfrac{1}{3}y^3\right]_0^{2-x} dx$$

$$= 2\int_0^2 \tfrac{1}{6}(2 - x)^3\,dx = -\frac{1}{3}\left[\frac{(2 - x)^4}{4}\right]_0^2 = \frac{4}{3}$$

(The last two integrals must be equal because the figure is symmetrical with respect to x and y.) Therefore

$$\iiint_V \mathbf{A}\,dV = \tfrac{4}{3}(\mathbf{i} + 2\mathbf{j} - 2\mathbf{k})$$
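The order of integration used in Example 4.10 (z first, then y, then x) translates directly into three nested one-dimensional integrations. The following sketch, with the hypothetical helpers `simpson` and `tetra_integral`, evaluates the three scalar integrals and assembles the vector result.

```python
def simpson(f, a, b, n=20):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def tetra_integral(f):
    """Integrate f(x, y, z) over the region x, y, z >= 0, 2x + 2y + z <= 4."""
    def over_y(x):
        def over_z(y):
            # z runs from 0 to 2(2 - x - y) for fixed x and y.
            return simpson(lambda z: f(x, y, z), 0.0, 2 * (2 - x - y))
        # y runs from 0 to 2 - x for fixed x.
        return simpson(over_z, 0.0, 2 - x)
    return simpson(over_y, 0.0, 2.0)

volume = tetra_integral(lambda x, y, z: 1.0)
moment_x = tetra_integral(lambda x, y, z: x)
moment_y = tetra_integral(lambda x, y, z: y)

# A = x i + j - 2y k, so the vector integral has these components.
integral_A = (moment_x, volume, -2 * moment_y)
print(integral_A)
```

Because every integrand met along the way is a polynomial of degree three or less, Simpson's rule introduces no truncation error here.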


EXERCISES 4.4 
1. A tetrahedron is bounded by the planes x = 0, y = 0, z = 0 and 3x + 2y + 2z = 6. Its density ρ(x, y, z) = xy. Find its mass. [0.9]

2. Evaluate $\iiint z(x^2 + y^2)\,dx\,dy\,dz$ through the volume of the cylinder $x^2 + y^2 = a^2$ intercepted by the planes z = 0, z = h. [$\tfrac{1}{4}\pi a^4h^2$]

3. Evaluate $\iiint (xy + yz + zx)\,dx\,dy\,dz$ through the interior of the cube determined by 0 ≤ x ≤ a, 0 ≤ y ≤ a, 0 ≤ z ≤ a. [$\tfrac{3}{4}a^5$]

4. Evaluate $\iiint (2x + y)\,dx\,dy\,dz$ over the region bounded by x = 0, y = 0, z = 0, y = 2 and the cylinder $z = 4 - x^2$. [$26\tfrac{2}{3}$]

5. Find the volume bounded by the plane x + y + 2z = 4a and the paraboloid $x^2 + y^2 = 4az$. [$(25/2)\pi a^3$]


Dynamics of a Particle 


Let us now consider the applications of vector calculus to dynamics. 


4.6 Linear Momentum; Impulse; Activity; Kinetic Energy 


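The linear momentum p of a particle of mass m moving with a velocity v is defined by

$$\mathbf{p} = m\mathbf{v} \qquad (4.12)$$

By Newton's second law of motion, the force F acting on the particle is equal to the rate of change of linear momentum which it produces, i.e.

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = \frac{d}{dt}(m\mathbf{v}) = m\frac{d\mathbf{v}}{dt} = m\mathbf{a} \qquad (4.13)$$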


where a is the acceleration of the particle assuming m is constant. 
Let the position vector of the particle be 
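$$\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$$

and let forces $\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_r, \ldots, \mathbf{F}_n$ each act on the particle where $\mathbf{F}_r = X_r\mathbf{i} + Y_r\mathbf{j} + Z_r\mathbf{k}$. The resultant force is

$$\mathbf{R} = \sum \mathbf{F}_r = \sum (X_r\mathbf{i} + Y_r\mathbf{j} + Z_r\mathbf{k})$$

and the equation of motion is $\mathbf{R} = m\ddot{\mathbf{r}}$, i.e.

$$\sum (X_r\mathbf{i} + Y_r\mathbf{j} + Z_r\mathbf{k}) = m(\ddot{x}\mathbf{i} + \ddot{y}\mathbf{j} + \ddot{z}\mathbf{k})$$

Therefore

$$\sum X_r = m\ddot{x} \qquad \sum Y_r = m\ddot{y} \qquad \sum Z_r = m\ddot{z} \qquad (4.14)$$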
Suppose a force F acts on a particle of mass m for an interval of time $t_0$ to $t_1$ during which the velocity of the particle changes from $\mathbf{v}_0$ to $\mathbf{v}_1$. The impulse I of the force F is defined to be the change of linear momentum produced, i.e.

$$\mathbf{I} = m(\mathbf{v}_1 - \mathbf{v}_0) \qquad (4.15)$$

If F is a variable force,

$$\mathbf{I} = \int_{t_0}^{t_1} m\frac{d\mathbf{v}}{dt}\,dt = \int_{t_0}^{t_1} \mathbf{F}\,dt$$

so that the impulse of a force is its time integral over the interval during which it acts. If F is a constant force, $\mathbf{I} = \mathbf{F}(t_1 - t_0)$.
A large force acting for a short time is called an impulsive force or simply an impulse, and is measured by the change of momentum which it produces.

Let a particle be given a displacement δr in a time δt under the action of a force F. The work done by F in the time δt is F · δr. Therefore the average rate of working of F in the interval δt is

$$\frac{\mathbf{F}\cdot\delta\mathbf{r}}{\delta t}$$

In the limit as δt → 0, the instantaneous rate of working or the activity of the force F is

$$\mathbf{F}\cdot\dot{\mathbf{r}} = \mathbf{F}\cdot\mathbf{v} \qquad (4.16)$$

It follows that the work done by the force F in the interval $t_0$ to $t_1$ is given by

$$\int_{t_0}^{t_1} \mathbf{F}\cdot\mathbf{v}\,dt \qquad (4.17)$$


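The kinetic energy T of a particle of mass m moving with a velocity v is defined by

$$T = \tfrac{1}{2}mv^2 = \tfrac{1}{2}m\,\mathbf{v}\cdot\mathbf{v} \qquad (4.18)$$

Differentiating with respect to t,

$$\frac{dT}{dt} = \frac{d}{dt}\left(\tfrac{1}{2}m\,\mathbf{v}\cdot\mathbf{v}\right) = m\,\mathbf{v}\cdot\dot{\mathbf{v}} = m\,\mathbf{a}\cdot\mathbf{v} = \mathbf{F}\cdot\mathbf{v} \qquad (4.19)$$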


which is the activity of the force acting on the particle. Thus the rate of 
increase of the kinetic energy of the particle equals the rate of working of the 
force acting upon it. It follows that the increase of kinetic energy in a finite 
interval of time is equal to the work done by the force acting upon it during 
that interval. This statement is true whether the force is constant or variable 
and is called the principle of energy. 


4.7 Motion of a Particle under Gravity 


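Let a particle be projected from the point O at time t = 0 with a velocity $\mathbf{V}_0 = u_0\mathbf{i} + v_0\mathbf{j}$. Take axes of coordinates Ox, Oy such that $\mathbf{V}_0$ is in the plane xOy and Ox is horizontal. Let r be the position vector of the particle at time t. Then, if g is acceleration due to gravity vertically downwards,

Figure 4.11

$$\ddot{\mathbf{r}} = -g\mathbf{j}$$

Therefore

$$\mathbf{v} = \dot{\mathbf{r}} = -gt\mathbf{j} + \mathbf{V}_0 \quad \text{satisfying the condition } \mathbf{v} = \mathbf{V}_0 \text{ when } t = 0$$

$$\mathbf{r} = x\mathbf{i} + y\mathbf{j} = -\tfrac{1}{2}gt^2\mathbf{j} + \mathbf{V}_0t \quad \text{satisfying the condition } \mathbf{r} = \mathbf{0} \text{ when } t = 0$$

$$= -\tfrac{1}{2}gt^2\mathbf{j} + t(u_0\mathbf{i} + v_0\mathbf{j})$$

Therefore

$$x = u_0t \quad \text{and} \quad y = v_0t - \tfrac{1}{2}gt^2$$

These are the parametric equations of the parabola

$$y = \frac{v_0}{u_0}x - \frac{1}{2}g\frac{x^2}{u_0^2}$$

which has a vertical axis of symmetry.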
4.8 Moment of Momentum (Angular Momentum) 

Let r be the position vector of a particle of mass m moving, at a given instant, with a velocity v. The velocity vector v of the particle, and consequently its momentum vector mv, will be localized vectors coincident with the tangent to the path of the particle at R (Fig. 4.12).

Figure 4.12

The moment of the momentum vector mv about O is

$$\mathbf{H} = \mathbf{r}\times m\mathbf{v} \qquad (4.20)$$

and is also called the angular momentum of the particle about O.
Let F be the resultant force acting on the particle. The moment or torque of F about O is r × F. Now

$$\frac{d\mathbf{H}}{dt} = \frac{d}{dt}(\mathbf{r}\times m\mathbf{v}) = \dot{\mathbf{r}}\times m\mathbf{v} + \mathbf{r}\times m\dot{\mathbf{v}} = \mathbf{v}\times m\mathbf{v} + \mathbf{r}\times m\dot{\mathbf{v}}$$

$$= \mathbf{r}\times m\dot{\mathbf{v}} = \mathbf{r}\times\mathbf{F}$$




Therefore the rate of change of the angular momentum of the particle about 
O equals the moment about O of the resultant force acting on the particle. 
This is the Principle of Angular Momentum. 

If the resultant force has a zero moment about O, the angular momentum 
of the particle about O will remain constant. This is the Principle of the 
Conservation of Angular Momentum. 


4.9 Central Forces 


Let the position vector of a particle R of mass m with respect to a fixed point 
O be r = OR (Fig. 4.13). The equation of motion of the particle under the 


Figure 4.13 


action of a force of magnitude f(r) acting in the line OR is

$$m\ddot{\mathbf{r}} = f(r)\,\mathbf{r}_1 \qquad (4.21)$$

where $\mathbf{r}_1$ is a unit vector having the direction of r.

If f(r) < 0, then $\ddot{\mathbf{r}}$ has a direction opposite to that of r, and the force is directed towards O so that the particle is attracted towards O. On the other hand, if f(r) > 0, the particle is moving under the action of a repulsive force from O.

Such forces, always acting in the line OR between a fixed point O and the 
particle R and having a magnitude which is a function of r only, are called 
central forces having O as the centre of force. 

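Since the moment of a central force about O is zero, it follows from section 4.8 that the angular momentum of the particle about O remains constant. Therefore

$$\mathbf{r}\times m\mathbf{v} = \mathbf{c} \quad \text{a constant vector}$$

$$\mathbf{r}\times\mathbf{v} = \mathbf{h} \quad \text{a constant vector} \qquad (4.22)$$

$$= pv\,\mathbf{k} \qquad (4.23)$$

where k is a unit vector having the direction of r × v and therefore perpendicular to the plane of r and v, and p is the perpendicular distance of v from O. It follows that the path of the particle lies in a plane determined by the initial vectors r and v.

By Example 3.3 (p. 71), $\mathbf{v} = \dot{r}\,\mathbf{r}_1 + r\dot{\theta}\,\mathbf{s}_1$, therefore by eqn (4.22),

$$\mathbf{r}\times(\dot{r}\,\mathbf{r}_1 + r\dot{\theta}\,\mathbf{s}_1) = \mathbf{h} \quad \text{i.e.} \quad r^2\dot{\theta}\,(\mathbf{r}_1\times\mathbf{s}_1) = \mathbf{h}$$

$$r^2\dot{\theta}\,\mathbf{k} = \mathbf{h} = pv\,\mathbf{k} \qquad (4.24)$$

$$r^2\dot{\theta} = pv = h \qquad (4.25)$$

Now suppose that the radius vector r = OR turns through an angle δθ in a time δt. It sweeps out an area $\tfrac{1}{2}r^2\,\delta\theta$ in this time. Therefore the rate of description of area by the radius vector is

$$\lim_{\delta t \to 0} \frac{\tfrac{1}{2}r^2\,\delta\theta}{\delta t} = \tfrac{1}{2}r^2\dot{\theta} = \tfrac{1}{2}h = \tfrac{1}{2}pv \qquad (4.26)$$

Thus when a particle moves under a central force, the rate of description of area by the radius vector is constant and equal to $\tfrac{1}{2}h$.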
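The constancy of h = r × v under a central force can be illustrated by stepping the equation of motion numerically and monitoring the k-component of r × v. The sketch below is only an illustration under stated assumptions: it uses a kick-drift-kick (leapfrog) step, an inverse-square attraction with μ = 1, and arbitrary initial conditions, none of which come from the text.

```python
def inverse_square(x, y, mu=1.0):
    """Acceleration of a unit mass attracted to the origin by mu / r^2."""
    r3 = (x * x + y * y) ** 1.5
    return (-mu * x / r3, -mu * y / r3)

def simulate(r, v, accel, dt=1e-3, steps=5000):
    """Advance the motion with a kick-drift-kick scheme; return h after each step."""
    x, y = r
    vx, vy = v
    history = []
    for _ in range(steps):
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax          # half kick: change in v is parallel to r
        vy += 0.5 * dt * ay
        x += dt * vx                 # drift: change in r is parallel to v
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax          # second half kick
        vy += 0.5 * dt * ay
        history.append(x * vy - y * vx)   # k-component of r x v
    return history

h_values = simulate((1.0, 0.0), (0.0, 0.8), inverse_square)
print(min(h_values), max(h_values))
```

Each half kick alters v along r and each drift alters r along v, so neither changes r × v; the recorded values therefore stay at the initial value 0.8 apart from rounding, whatever central law is substituted for `inverse_square`.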


4.10 Planetary Orbits 


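Newton's Universal Law of Gravitation states that two bodies of masses M, m are mutually attracted with a force

$$F = \frac{GMm}{r^2}$$

where r is their distance apart and G is a universal constant. Let M be the mass of the sun and m that of a planet, e.g. the earth. Then, if the attractions due to the other planets are assumed to be negligible, the equation of motion of the planet is

$$m\ddot{\mathbf{r}} = -\frac{GMm}{r^2}\,\mathbf{r}_1 \quad \text{i.e.} \quad \ddot{\mathbf{r}} = -\frac{GM}{r^2}\,\mathbf{r}_1 \qquad (4.27)$$

r being the position vector of the planet with respect to the sun as origin and $\mathbf{r}_1$ a unit vector having the direction of r. By section 4.9, the rate at which the radius vector sweeps out area is constant. This is one of Kepler's Laws of Planetary Motion. By eqns (4.22) and (4.24) we have

$$\mathbf{h} = \mathbf{r}\times\mathbf{v} = r^2\dot{\theta}\,\mathbf{k}$$

Now

$$\frac{d}{dt}(\mathbf{v}\times\mathbf{h}) = \dot{\mathbf{v}}\times\mathbf{h} = \ddot{\mathbf{r}}\times\mathbf{h} = -\frac{GM}{r^2}\,\mathbf{r}_1\times r^2\dot{\theta}\,\mathbf{k}$$

$$= -GM\dot{\theta}\,(\mathbf{r}_1\times\mathbf{k}) = GM\dot{\theta}\,\mathbf{s}_1 = GM\,\frac{d\mathbf{r}_1}{d\theta}\,\frac{d\theta}{dt} = GM\,\frac{d\mathbf{r}_1}{dt}$$

Therefore

$$\mathbf{v}\times\mathbf{h} = GM\,\mathbf{r}_1 + \mathbf{c} \qquad (4.28)$$

$$\mathbf{r}\cdot(\mathbf{v}\times\mathbf{h}) = (\mathbf{r}\times\mathbf{v})\cdot\mathbf{h} = GM\,\mathbf{r}\cdot\mathbf{r}_1 + \mathbf{c}\cdot\mathbf{r}$$

Therefore

$$\mathbf{h}\cdot\mathbf{h} = h^2 = GMr + cr\cos\theta$$

where θ is the angle between the directions of c and r. Therefore

$$\frac{h^2}{GM} = r\left(1 + \frac{c}{GM}\cos\theta\right) \qquad (4.29)$$

Now writing GM = μ, $l = h^2/\mu$ and e = c/μ, the equation of the orbit is

$$l = r(1 + e\cos\theta) \qquad (4.30)$$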

which is the polar form of the equation of a conic of eccentricity e and semi-latus rectum l referred to its focus as origin. This equation represents an ellipse, parabola, or hyperbola according as e is less than, equal to, or greater than unity. The orbits of the planets are closed curves and are therefore ellipses with the sun at a focus.

Let a, b be the semiaxes of the ellipse; its area is then πab. The rate of description of area by the radius vector drawn from the focus O (Fig. 4.14) is $\tfrac{1}{2}h$. Therefore the time taken by the planet to complete one circuit of the orbit, the periodic time T, is given by

Figure 4.14

$$T = \frac{\pi ab}{\tfrac{1}{2}h} = \frac{2\pi ab}{\sqrt{(\mu l)}} = \frac{2\pi}{\sqrt{\mu}}\,a^{3/2} \quad \text{since } l = \frac{b^2}{a} \qquad (4.31)$$


Thus the squares of the periodic times of the planets in their motion relative 
to the sun are proportional to the cubes of the major axes of their orbits. 


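This law was stated by Kepler following observations of the planets.
From eqn (4.27) we have

$$2\dot{\mathbf{r}}\cdot\ddot{\mathbf{r}} = -\frac{2\mu}{r^2}\,\mathbf{r}_1\cdot\dot{\mathbf{r}} = -\frac{2\mu}{r^3}\,\mathbf{r}\cdot\dot{\mathbf{r}} = -\frac{2\mu}{r^2}\,\frac{dr}{dt}$$

Integrating with respect to t, we have

$$(\dot{\mathbf{r}})^2 = v^2 = \frac{2\mu}{r} + C \qquad (4.32)$$

Let $v_0$ be the planet's velocity at B, the end of the minor axis, i.e. when r = a. Now, if p is the length of the perpendicular from O to the direction of $v_0$, by eqn (4.25), we have

$$pv = h \quad \text{thus} \quad bv_0 = \sqrt{(\mu l)} = b\sqrt{\left(\frac{\mu}{a}\right)} \quad \text{and} \quad v_0 = \sqrt{(\mu/a)}$$

Thus $v = \sqrt{(\mu/a)}$ when r = a. On substituting in eqn (4.32) we have

$$\frac{\mu}{a} = \frac{2\mu}{a} + C \qquad C = -\frac{\mu}{a}$$

Substituting in eqn (4.32), the velocity equation for the orbit is

$$v^2 = \mu\left(\frac{2}{r} - \frac{1}{a}\right) \qquad (4.33)$$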


a 
4.11 Motion under a Central Force Directly Proportional to 
Distance 
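Let a particle of mass m be moving under the action of a force mμr directed towards a fixed point O, r being the position vector of the particle with respect to O (Fig. 4.15). The equation of motion of the particle is

$$m\ddot{\mathbf{r}} = -m\mu\mathbf{r} \quad \text{i.e.} \quad \ddot{\mathbf{r}} = -\mu\mathbf{r} \qquad (4.34)$$

$$2\dot{\mathbf{r}}\cdot\ddot{\mathbf{r}} = -2\mu\,\mathbf{r}\cdot\dot{\mathbf{r}}$$

Figure 4.15

Integrating with respect to t,

$$(\dot{\mathbf{r}})^2 = v^2 = -\mu r^2 + C$$

If p is the perpendicular from O to the velocity vector v at R, we have pv = h. Therefore

$$\frac{1}{p^2} = \frac{v^2}{h^2} = \frac{C - \mu r^2}{h^2}$$

Now the p,r equation of an ellipse of semi-axes a, b referred to its centre is

$$\frac{1}{p^2} = \frac{a^2 + b^2 - r^2}{a^2b^2}$$

Hence the path of the particle is an ellipse, centre O, such that

$$\frac{\mu}{h^2} = \frac{1}{a^2b^2} \quad \text{hence} \quad h = ab\sqrt{\mu}$$

and

$$\frac{C}{h^2} = \frac{a^2 + b^2}{a^2b^2} \quad \text{hence} \quad C = \mu(a^2 + b^2)$$

Therefore

$$v^2 = C - \mu r^2 = \mu(a^2 + b^2 - r^2) \qquad (4.35)$$

i.e.

$$v = \sqrt{\mu}\,\sqrt{(a^2 + b^2 - r^2)} = OS\sqrt{\mu} \qquad (4.36)$$

where OS is the semi-diameter conjugate to OR and parallel to v. The rate of description of area by the position vector r is $\tfrac{1}{2}h = \tfrac{1}{2}ab\sqrt{\mu}$ so that the periodic time T is given by

$$T = \frac{\pi ab}{\tfrac{1}{2}ab\sqrt{\mu}} = \frac{2\pi}{\sqrt{\mu}} \qquad (4.37)$$

and is therefore independent of the size of the orbit.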


EXAMPLE 4.11 


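A particle is projected with a velocity of 120 m/s in a direction inclined at an angle $\tan^{-1} 3/4$ with the horizontal. Find its range on a horizontal plane through the point of projection and the time of flight. (Take g = 9.8 m/s².)

Solution

$$\ddot{\mathbf{r}} = -g\mathbf{j} \qquad \dot{\mathbf{r}} = -gt\mathbf{j} + \mathbf{a}$$

When t = 0,

$$\dot{\mathbf{r}} = 120(\tfrac{4}{5}\mathbf{i} + \tfrac{3}{5}\mathbf{j}) = \mathbf{a} \quad \text{and} \quad \mathbf{a} = 96\mathbf{i} + 72\mathbf{j}$$

Figure 4.16

Substituting for a,

$$\dot{\mathbf{r}} = 96\mathbf{i} + (72 - gt)\mathbf{j}$$

$$\mathbf{r} = 96t\mathbf{i} + (72t - \tfrac{1}{2}gt^2)\mathbf{j}$$

satisfying the initial condition r = 0 when t = 0. Now r = OA when $72t - \tfrac{1}{2}gt^2 = 0$, i.e. when t = 144/g = 14.69 sec, so that the time of flight is 144/g sec and the range is

$$OA = |\mathbf{r}|_{t=144/g} = 96 \times 144/g = 1\,411 \text{ m approx}$$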
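The arithmetic of Example 4.11 follows from the parametric equations x = u₀t, y = v₀t − ½gt² of section 4.7. The sketch below assumes the angle is supplied as the two sides of its tangent ratio; the function name `flight` and its parameters are illustrative only.

```python
import math

def flight(speed, rise, run, g=9.8):
    """Time of flight and horizontal range for projection at angle atan(rise/run).

    Uses x = u0 * t and y = v0 * t - g * t**2 / 2; the particle lands when y = 0.
    """
    hyp = math.hypot(rise, run)
    u0 = speed * run / hyp       # horizontal component of the initial velocity
    v0 = speed * rise / hyp      # vertical component of the initial velocity
    t_flight = 2 * v0 / g        # non-zero root of v0 * t - g * t**2 / 2 = 0
    return t_flight, u0 * t_flight

t_flight, horizontal_range = flight(120.0, 3.0, 4.0)
print(round(t_flight, 2), round(horizontal_range))
```

With speed 120 m/s and tan⁻¹ 3/4 the components are 96 and 72 m/s, as in the worked solution.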


EXAMPLE 4.12 

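A particle moves under the attraction of a force $\mu/r^2$ per unit mass directed towards a fixed point S. The distance of the particle from S is r (Fig. 4.17). The particle is projected from a point P, where SP = a, with a velocity $\sqrt{(\mu/a)}$ in a direction which makes an angle $\tfrac{1}{4}\pi$ with SP. Show that it moves in an elliptic orbit of eccentricity $1/\sqrt{2}$ in a periodic time $(2\pi a^{3/2})/\mu^{1/2}$.

Figure 4.17

Solution

$$\mathbf{h} = r^2\dot{\theta}\,\mathbf{k} = pv\,\mathbf{k} = (a\sin\tfrac{1}{4}\pi)\sqrt{\left(\frac{\mu}{a}\right)}\,\mathbf{k} = \sqrt{(\tfrac{1}{2}\mu a)}\,\mathbf{k}$$

Now $\ddot{\mathbf{r}} = -(\mu/r^2)\,\mathbf{r}_1$. Therefore by eqn (4.28),

$$\mathbf{v}\times\mathbf{h} = \mu\mathbf{r}_1 + \mathbf{c}$$

Initially $\mathbf{v} = \sqrt{(\mu/a)}\,(\cos\tfrac{1}{4}\pi\,\mathbf{r}_{10} + \sin\tfrac{1}{4}\pi\,\mathbf{s}_{10}) = \sqrt{(\mu/2a)}\,(\mathbf{r}_{10} + \mathbf{s}_{10})$ where $\mathbf{r}_{10}$ and $\mathbf{s}_{10}$ are the radial and transverse unit vectors at P. Therefore, initially,

$$\mathbf{v}\times\mathbf{h} = \tfrac{1}{2}\mu(\mathbf{r}_{10} + \mathbf{s}_{10})\times\mathbf{k} = \tfrac{1}{2}\mu(-\mathbf{s}_{10} + \mathbf{r}_{10}) = \mu\mathbf{r}_{10} + \mathbf{c}$$

$$\mathbf{c} = -\tfrac{1}{2}\mu(\mathbf{r}_{10} + \mathbf{s}_{10}) \qquad c = |\mathbf{c}| = \tfrac{1}{2}\mu\sqrt{2} = \mu/\sqrt{2}$$

Hence, the eccentricity of the orbit is

$$e = c/\mu = 1/\sqrt{2}$$

Thus the orbit is an ellipse of eccentricity $1/\sqrt{2}$.
Now

$$l = \frac{h^2}{\mu} = \frac{\tfrac{1}{2}\mu a}{\mu} = \frac{a}{2}$$

Let A and B be the semi-major and semi-minor axes respectively, then

$$\frac{a}{2} = l = \frac{B^2}{A} = A(1 - e^2) = \frac{A}{2} \quad \text{thus} \quad A = a$$

By eqn (4.31), the periodic time is $2\pi A^{3/2}/\mu^{1/2} = (2\pi a^{3/2})/\mu^{1/2}$.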
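The steps of Example 4.12 can be collected into a small routine that turns launch conditions into orbit parameters. This is a sketch under stated assumptions: it uses h = pv from eqn (4.25), l = h²/μ, the velocity equation (4.33) to obtain the semi-major axis, the relation l = A(1 − e²) quoted in the solution, and the period of eqn (4.31); the function name and the choice μ = a = 1 for the usage lines are illustrative.

```python
import math

def orbit_from_launch(mu, r0, v0, angle):
    """Eccentricity, semi-major axis and period for an inverse-square attraction.

    mu    : attraction per unit mass at unit distance
    r0    : distance from the centre of force at projection
    v0    : speed of projection
    angle : angle between the velocity and the radius vector, in radians
    """
    h = r0 * v0 * math.sin(angle)            # h = p v, with p = r0 sin(angle)
    l = h * h / mu                           # semi-latus rectum
    inv_a = 2.0 / r0 - v0 * v0 / mu          # from v^2 = mu (2/r - 1/a)
    e = math.sqrt(max(0.0, 1.0 - l * inv_a))
    if inv_a <= 0:
        return e, None, None                 # parabola or hyperbola: no period
    semi_major = 1.0 / inv_a
    period = 2.0 * math.pi * semi_major ** 1.5 / math.sqrt(mu)
    return e, semi_major, period

mu, a = 1.0, 1.0
e, semi_major, period = orbit_from_launch(mu, a, math.sqrt(mu / a), math.pi / 4)
print(e, semi_major, period)
```

For the data of the example the routine returns an eccentricity of 1/√2 and a semi-major axis equal to a.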


EXAMPLE 4.13 

A particle of unit mass is describing an elliptic orbit with semi-axes 2a and a under an attraction μr to the centre C when it is at a distance r from C. If the intensity of the attraction is suddenly increased in the ratio 4:1 when the particle is at a distance 3a/2 from C, show that the semi-axes of the new orbit are $\tfrac{1}{8}a(\sqrt{79} \pm \sqrt{15})$.

Solution By eqn (4.35)

$$v^2 = \mu(a^2 + b^2 - r^2) = \mu(5a^2 - r^2)$$

Let $v_1$ be the velocity when r = 3a/2 (Fig. 4.18). Therefore

$$v_1^2 = \mu(5a^2 - 9a^2/4) = 11\mu a^2/4$$

Figure 4.18

When the particle is at P, the attraction is suddenly quadrupled so that "μ" becomes 4μ and the velocity equation for the new elliptic orbit is

$$v^2 = 4\mu(A^2 + B^2 - r^2)$$

where A, B are the semi-axes of the new orbit.
When r = 3a/2, we have $v = v_1$, therefore

$$11\mu a^2/4 = 4\mu(A^2 + B^2 - 9a^2/4) \qquad A^2 + B^2 = 47a^2/16$$

When the force is suddenly quadrupled at P, the moment of the velocity about C is not changed. Therefore

$$h = 2a \times a\sqrt{\mu} = 2\sqrt{\mu}\,AB \qquad AB = a^2$$

Therefore

$$(A + B)^2 = \tfrac{47}{16}a^2 + 2a^2 = \tfrac{79}{16}a^2 \quad \text{and} \quad (A - B)^2 = \tfrac{47}{16}a^2 - 2a^2 = \tfrac{15}{16}a^2$$

i.e.

$$A + B = \tfrac{1}{4}a\sqrt{79} \quad \text{and} \quad A - B = \tfrac{1}{4}a\sqrt{15}$$

$$A = \tfrac{1}{8}a(\sqrt{79} + \sqrt{15}) \qquad B = \tfrac{1}{8}a(\sqrt{79} - \sqrt{15})$$
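The two conditions used in Example 4.13 (the speed is unchanged at the instant of the change, and so is the moment of the velocity about C) determine A² + B² and AB, from which A and B follow. The sketch below generalises the ratio 4:1 to a hypothetical factor `k`; the function name and argument order are illustrative.

```python
import math

def new_semi_axes(a, b, r, k):
    """Semi-axes of the new orbit when the attraction mu*r is multiplied by k.

    a, b : semi-axes of the original orbit
    r    : distance from the centre at the instant of the change
    k    : factor by which the intensity of the attraction is increased
    """
    # Speed unchanged: mu (a^2 + b^2 - r^2) = k mu (A^2 + B^2 - r^2), eqn (4.35).
    sum_of_squares = (a * a + b * b - r * r) / k + r * r
    # Moment of velocity unchanged: a b sqrt(mu) = A B sqrt(k mu).
    product = a * b / math.sqrt(k)
    total = math.sqrt(sum_of_squares + 2 * product)        # A + B
    difference = math.sqrt(sum_of_squares - 2 * product)   # A - B
    return (total + difference) / 2, (total - difference) / 2

# Data of the example with a = 1: semi-axes 2a and a, change at r = 3a/2, ratio 4:1.
A, B = new_semi_axes(2.0, 1.0, 1.5, 4.0)
print(A, B)
```

With a = 1 the values agree with (√79 ± √15)/8, and their product is a² as required.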


EXERCISES 4.5 
1. The acceleration of a particle P moving on the curve r = ae*® always has the 
direction of the line OP where O is the origin. Show that the magnitude of the 
acceleration is proportional to $r^{-3}$.
2. A particle is projected with a velocity of 130 m/s at an angle $\tan^{-1} 5/12$ with the horizontal. Show that its range on a horizontal plane through the point of projection is 12 000/g metres and the time of flight is 100/g sec.

If the particle is projected directly up a plane which passes through the point of projection and is inclined at an angle $\tan^{-1} 1/4$ above the horizontal, show that it strikes the plane after 40/g sec, with a velocity $10\sqrt{145}$ m/s whose direction is inclined at $\tan^{-1} 1/12$ to the horizontal. Show also that the range on the plane is $1\,200\sqrt{17}/g$ metres.


3. Two particles are projected simultaneously with velocities u and v from points 
A and B respectively, each directly towards the other. Show that the particles 
will collide at a point vertically below a point C of AB such that AC/CB = u/v. 


4. A particle is attracted towards a fixed point S by a force $3ku^2/r^2$ per unit mass (k, u constants) where r is its distance from S. If the particle is projected from a point A, distant k from S, with a velocity u whose direction makes an angle π/3 with SA, show that it will describe an ellipse of eccentricity $\sqrt{(7/12)}$.


5. A particle is describing an elliptic orbit about a centre of force at a focus S. At a point P in its orbit, distant c from S, the tangent to the orbit is inclined at an angle $\sin^{-1} 3/5$ to SP. At a point Q on the orbit, distant 3c/2 from S, the speed of the particle is half its speed at P. If the intensity of the attraction per unit mass at unit distance is 1, find the speed at P. Show that the major axis of the orbit is 9c/5 and find its eccentricity. [$\tfrac{2}{3}\sqrt{(2/c)}$, $\sqrt{(29/45)}$]

6. A particle moves under a central force inversely proportional to the square of the distance from a fixed point S, the accelerating effect of the force at a distance of 1 m being g = 9.8 m/s². If the particle is projected in a direction perpendicular to a radius at a distance 3/4 m from S with a velocity of 4 m/s, determine the major axis and the eccentricity of the orbit. What value must the velocity of projection exceed to make the orbit a hyperbola? [0.97 m, 0.224, 5.11 m/s]


7. A point P moves so that its velocity is the vector sum of two components of 
constant magnitudes u, v, the first being in a fixed direction and the second 
perpendicular to the line joining P to a fixed point S. Show that the orbit is a 
conic of eccentricity u/v. 


8. A particle of mass m is describing an ellipse of major and minor axes 2a, 2b 
respectively about a centre of force at the centre. When it reaches the end of the 
major axis, it strikes and coalesces with a particle of mass nm at rest. The central 
attraction per unit mass is unchanged. Prove that the new orbit is an ellipse of 
major and minor axes 2a, 2b/(n + 1) respectively. 


5 Probability theory 


5.1 Introduction 


The concept of probability, which has played a fundamental part in the 
development of the theory of statistics, arose from the consideration of games 
of chance which are essentially experiments of a repetitive nature. In any 
game of chance, such as tossing a coin, throwing a die or drawing a card, the 
outcome of any particular trial is uncertain. Nevertheless, experience has 
shown that games of chance and many repetitive industrial operations and 
scientific experiments behave, in the long run, as if they were essentially 
stable. For example, an unbiased coin would show heads in about one-half of 
a large number of tosses. Again, whilst an insurance company could not 
predict which particular man would die at the age of 60, it could predict 
what proportion of men would die at that age. 

To enable the statistician to predict the outcomes of future trials of a 
repetitive experiment and to study its properties, it is essential that he should 
construct a mathematical model applicable to the experiment. 


5.2 Outcomes and Events 


Any repetitive experiment has a number of alternative outcomes which are mutually exclusive. For example, when a coin is tossed, there are two alternative outcomes, head or tail (H or T). If the coin is unbiased, H or T may be expected to fall with approximately equal frequencies in a large number of trials. Consequently, the probability of each of the outcomes H or T is said to be 1/2.




If two coins are tossed (or one coin is tossed twice), there are four possible 
outcomes which are mutually exclusive, namely 


HH    HT    TH    TT


which are equally likely. Consequently, for balanced coins, each of these 
outcomes is given the probability 1/4. 

When an unbiased die is thrown, there are six possible mutually exclusive 
outcomes, 1, 2, 3, 4, 5, or 6, the probability of each outcome being 1/6. If two 
true dice are thrown, there will be 6 x 6, i.e. 36 outcomes: 


(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)


each outcome occurring with a probability of 1/36. 
A single outcome is called a simple event or an element but an event may 
comprise either one outcome or a group of outcomes, for example 
(a) When two coins are tossed, the event in which one head and one tail 
occur together comprises the two outcomes HT and TH, whilst the 
event in which two tails occur consists of the single element TT.
(b) When two dice are thrown, the event in which two fours occur consists 
of the single outcome (4, 4) whilst the event, in which a total of eight is 
thrown, comprises the five elements 


(6,2) (5,3) (4,4) (3,5) (2,6)


From such considerations, we derive the following classical definition of 
probability: 


DEFINITION 5.1 

If a repetitive experiment has n mutually exclusive and equally likely
outcomes*, and if $n_A$ of the outcomes comprise an event A, then the
probability of the event A is $n_A/n$. ($0 \le n_A/n \le 1$)


* Probability distributions for which all the outcomes are equally likely are called 
“Laplace distributions”. 




With reference to (a) and (b), it follows that:

in (a) the probability of an event comprising one head and one tail is
2/4 = 1/2 and of the event comprising two tails 1/4;

in (b) the probability of throwing two fours is 1/36 and of throwing a
total of eight 5/36.

In applying the above definition, it is important to bear in mind the 
requirement that outcomes must be equally likely and mutually exclusive. 
Note, for example, that the probability of drawing either a king or a diamond 
from a pack of cards is not 17/52 but 16/52 since one of the four kings is also 
one of the thirteen diamonds. 
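
Definition 5.1 amounts to counting equally likely outcomes. As a minimal sketch (Python is used here purely for illustration, and the helper name `classical_probability` is an assumption rather than notation from the text), the total-of-eight and king-or-diamond results above may be checked by direct enumeration:

```python
from fractions import Fraction
from itertools import product

def classical_probability(outcomes, event):
    """n_A / n for equally likely, mutually exclusive outcomes."""
    outcomes = list(outcomes)
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

# Two dice: the 36 ordered pairs listed in Section 5.2.
dice = list(product(range(1, 7), repeat=2))
p_total_eight = classical_probability(dice, lambda o: o[0] + o[1] == 8)

# A pack of 52 cards as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
pack = list(product(ranks, suits))
p_king_or_diamond = classical_probability(
    pack, lambda c: c[0] == "K" or c[1] == "diamonds"
)

print(p_total_eight)      # 5/36
print(p_king_or_diamond)  # 4/13, i.e. 16/52
```

Counting each card once is what prevents the king of diamonds from being included twice.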

Probabilities derived from Definition 5.1 are called "a priori" probabilities,
being obtained by deduction from idealized models, e.g. from a "fair" die.

As already observed, the outcomes of many repetitive experiments show 
stability in the long run. Suppose that a series of experiments be performed 
under conditions which are as similar as possible and that the frequency of the 
occurrence of an event A be recorded. 

Let us now postulate that the relative frequency of the event A is an 
approximation to the probability of the event A. A probability measure
determined in this manner is known as an “a posteriori” probability or a 
statistical probability. 


5.3 Sample Points and Sample Space 


Suppose that every possible outcome of an experiment can be enumerated 
and a probability assigned to each outcome. It is convenient to represent 
each outcome by a point specified in appropriate coordinates—such points 
are called sample points to each of which a probability may be assigned, as 
in the following examples. 

The two outcomes H and T which result from the tossing of a true coin 
may conveniently be represented by two sample points on a straight line, 0 
for the tail and 1 for the head. 

Similarly, the six outcomes arising from throws of a symmetrical die may 
be represented by six sample points 1, 2, 3,..., 6 on a line each having a 
probability 1/6. 

The four outcomes HH, HT, TH, TT resulting from throws of two 
unbiased coins may be conveniently represented by the four sample points 
(1, 1), (1, 0), (0, 1), (0, 0) in two-dimensional cartesian space, each having 
a probability 1/4 (Fig. 5.1). 

If three balanced coins are tossed, the eight outcomes could be represented 
by eight sample points in three dimensional space, each point having a 
probability 1/8. 




(0,1) TH      (1,1) HH

(0,0) TT      (1,0) HT

Figure 5.1


When two fair dice are thrown, the 36 outcomes (Section 5.2) may be 
represented by the 36 sample points illustrated in Fig. 5.2, each bearing a 
probability 1/36. 

In the above examples, the outcomes are “equally likely” so that equal 
probabilities would be assigned to the associated sample points. In general, 
however, the probabilities assigned to sample points on the basis of the 
relative frequencies of the various outcomes in a large series of trials would 
be unequal. 


Figure 5.2 The 36 sample points for throws of two dice




DEFINITION 5.2 
The set of sample points representing all possible outcomes of an experiment 
comprises the sample space of that experiment. 

A sample space is said to be discrete or continuous according to whether 
it contains a finite number or a continuum of sample points. 

The probability assigned to any sample point is necessarily positive and 
the sum of the probabilities of all the sample points comprising the sample 
space of an experiment is 1. 


Let the probabilities assigned to the n sample points representing the n
outcomes of an experiment be $p_1, p_2, \ldots, p_n$; then

$$0 \le p_i \le 1 \quad\text{and}\quad \sum_{i=1}^{n} p_i = 1 \qquad (i = 1, 2, 3, \ldots, n)$$


5.4 The Probability of an Event 


Suppose that for each outcome of a repetitive experiment, it can be decided 
unambiguously whether or not an event A has occurred so that each sample 
point is one for which the event A has or has not occurred. The probability 
of the event A is defined as follows: 


DEFINITION 5.3 

The sum of the probabilities assigned to all those sample points associated 
with the occurrence of the event A is the probability p(A) of the event A. 
Let the Venn diagram (Fig. 5.3) represent the sample space for such an 
experiment and let the set A comprise those sample points associated with 
the occurrence of the event A. The sum of the probabilities assigned to these 
points gives the probability p(A) of the event A, that is 


$$p(A) = \sum p_i$$




Figure 5.3 The set A of sample points associated with the event A. 




where the sum is taken over the sample points in set A. It is clear that 
$$0 \le p(A) \le 1$$
and the probability that the event A does not happen is


$$p(\bar{A}) = 1 - p(A)$$




Figure 5.4 


Figure 5.4 shows the 36 sample points which comprise the sample space
for throws of two symmetrical dice. The probability of
(i) Event A, the throwing of a total of 7, is 6/36 = 1/6.
(ii) Event B, the throwing of a difference of 1, is 10/36 = 5/18.

The outcomes (4, 3) and (3, 4) are common to the events A and B. Conse- 
quently, the probability that either the event A or the event B or both occur 
is 14/36 = 7/18, whilst the probability that both occur, i.e. the probability 
of a throw for which the sum is 7 and the difference is 1, is 2/36 = 1/18. 
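
The counting behind these four probabilities can be reproduced by treating events as sets of sample points. The following is only a sketch under that set-based reading of Definition 5.3; the names `space`, `A`, `B` and `p` are chosen for illustration:

```python
from fractions import Fraction
from itertools import product

space = set(product(range(1, 7), repeat=2))       # 36 sample points
A = {s for s in space if s[0] + s[1] == 7}         # total of 7
B = {s for s in space if abs(s[0] - s[1]) == 1}    # difference of 1

def p(event):
    # Each sample point carries probability 1/36.
    return Fraction(len(event), len(space))

print(p(A), p(B), p(A | B), p(A & B))  # 1/6 5/18 7/18 1/18
```

The union `A | B` holds 6 + 10 - 2 = 14 points because the two common outcomes (4, 3) and (3, 4) are counted once only.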


5.5 Combinatorial Formulae 


To facilitate the computation of the probability of an event by counting the 
sample points associated with the occurrence of that event in accordance 
with Definition 5.1, the combinatorial formulae of Algebra are applied. 
The development of these formulae is based upon the following axioms. 
(1) If two mutually exclusive events A and B can occur in m ways and in 




n ways respectively then 
(i) either the event A or the event B can occur in (m + n) ways 
(ii) the events A and B can occur simultaneously in mn ways. 
For example, let the mutually exclusive events be 

A the drawing of an ace from a pack of cards—this can be done in 4 ways 

B the drawing of a king from a pack of cards—this can also be done in 
4 ways. 

(i) If one card only is drawn, the number of ways in which either an ace 
or a king may be drawn is 4 + 4, i.e. 8 ways. 

(ii) If two cards are drawn, the number of ways in which one ace and one 
king may be drawn is 4 x 4, i.e. 16 ways, since any one of the four 
aces may be drawn with any one of the 4 kings. 

(2) For three mutually exclusive events A, B and C which can occur in m, n 
and p ways respectively 

(i) either A or B or C can occur in (m + n + p) ways 

(ii) A, B and C can occur simultaneously in mnp ways. 

These axioms may be generalized, in an obvious manner, to apply to more 
than three events. 


Let us now determine the number of ways in which a group of n different 
objects may be arranged. Any such arrangement of the n objects in a definite 
order is called a permutation. 

Any one of the n objects may be placed in the first position which may
therefore be filled in n ways. The filling of the first position is thus an event
which may occur in n ways. Having placed any one object in the first position,
(n − 1) objects remain so that the filling of the second place (the second
event) may occur in (n − 1) ways. The first and second positions may
therefore be filled in n(n − 1) ways. Having placed any two objects in the
first two positions, (n − 2) objects remain and the third position may be
filled in (n − 2) ways and so on.

It follows that the number of permutations of n objects amongst themselves 
is 


$$n(n-1)(n-2)\cdots 2 \times 1 = n! \qquad (5.1)$$


where n! is called factorial n. 

Let us now determine the number of permutations of n objects when
only r (≤ n) of the objects are used in any permutation. As before, the various
positions may be filled successively in n, (n − 1), (n − 2), ... ways but
when we reach the final, i.e. the rth position in the permutation, (r − 1)
objects have been placed so that n − (r − 1) objects remain. Consequently,
the rth position may be filled in {n − (r − 1)} ways.

Therefore the number of permutations of n different objects taken r at a 




time, which will be denoted by the symbol $^nP_r$, is given by

$$^nP_r = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!} \qquad (5.2)$$


The number of combinations of n different objects taken r at a time is defined
to be the number of different selections, each comprising r objects, which
can be chosen from n objects without reference to their order within a
selection. This quantity will be denoted by $\binom{n}{r}$. (The symbol $^nC_r$ is also used.)


Two combinations are different when they do not contain the same 
objects, e.g. abc and abd are different three-letter combinations but abc, acb,
bca, bac, cab, cba, are the 6 (i.e. 3!) permutations of the same combination. 

Each combination of r objects may be permuted in r! ways. Thus

$$\binom{n}{r} \times r! = {}^nP_r \quad\text{i.e.}\quad \binom{n}{r} = \frac{{}^nP_r}{r!} = \frac{n!}{r!\,(n-r)!} \qquad (5.3)$$


It easily follows from (5.3) that

$$\binom{n}{r} = \binom{n}{n-r} \qquad (5.4)$$


This result is otherwise obvious since eqn (5.3) may be interpreted as the 
number of ways in which n different objects may be divided into two groups 
containing respectively r and (n — r) objects. 
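
As an illustrative check of eqns (5.2) to (5.4), the sketch below counts ordered and unordered selections of r = 3 letters from n = 5 by brute force and compares them with the factorial formulae. The choice of n, r and the letters is arbitrary, and Python's standard `math.perm` and `math.comb` are used only as a cross-check:

```python
import math
from itertools import combinations, permutations

n, r = 5, 3
objects = "abcde"

n_perm = sum(1 for _ in permutations(objects, r))   # ordered selections
n_comb = sum(1 for _ in combinations(objects, r))   # unordered selections

# (5.2): n! / (n - r)!
assert n_perm == math.factorial(n) // math.factorial(n - r) == math.perm(n, r)
# (5.3): each combination can be permuted in r! ways
assert n_comb == n_perm // math.factorial(r) == math.comb(n, r)
# (5.4): choosing r objects is the same as leaving out n - r
assert math.comb(n, r) == math.comb(n, n - r)

print(n_perm, n_comb)  # 60 10
```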

Let us now determine the number of ways in which n objects may be
divided into three groups containing respectively $n_1$, $n_2$ and $n_3$ objects such
that $n_1 + n_2 + n_3 = n$. First, divide up the n objects in two groups
containing $n_1$ and $(n_2 + n_3)$ objects respectively.

By (5.4), this may be done in $\binom{n}{n_1}$ ways.

Now divide any group containing $(n_2 + n_3)$ objects into two groups
containing respectively $n_2$ and $n_3$ objects.

This may be done in $\binom{n_2 + n_3}{n_2}$ ways.

It follows that the number of ways in which the complete process of
subdivision, resulting in three groups containing respectively $n_1$, $n_2$ and $n_3$



objects, may be performed is 


$$\binom{n}{n_1}\binom{n_2+n_3}{n_2} = \frac{n!}{n_1!\,(n_2+n_3)!} \cdot \frac{(n_2+n_3)!}{n_2!\,n_3!} = \frac{n!}{n_1!\,n_2!\,n_3!} \qquad (5.5)$$


By extending the above argument, it easily follows that the number of
ways in which n objects may be divided into k groups containing respectively
$n_1, n_2, n_3, \ldots, n_k$ objects such that $n_1 + n_2 + n_3 + \cdots + n_k = n$ is

$$\frac{n!}{n_1!\,n_2!\,n_3! \cdots n_k!} \qquad (5.6)$$


Equation (5.6) may also be interpreted as the number of permutations of n
objects containing k groups comprising $n_1$ identical objects of one type, $n_2$
identical objects of a second type and so on to $n_k$ identical objects of a kth
type.
The total number of different permutations, say P, in this case would be
less than n!, the number of permutations if all the objects were different.
Each of these P permutations would give rise to additional permutations if
objects of the same type were made different. If, for example, the $n_1$ objects
of the first type were made different, they could be rearranged in $n_1!$ ways
within each of the P permutations giving rise to $n_1!\,P$ permutations. Similarly
if the $n_2$ objects were all made different, $n_2!$ times as many permutations as
before, i.e. $n_1!\,n_2!\,P$ permutations, would result. Continuing this procedure,
until all the objects have been made different, the total number of
permutations would be

$$n_1!\,n_2! \cdots n_k!\,P = n! \quad\text{i.e.}\quad P = \frac{n!}{n_1!\,n_2! \cdots n_k!}$$
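
Equation (5.6) can be tested on a small case. The sketch below is illustrative only; the function name `multinomial` and the word "aabbc" are assumptions made for the example. It compares the formula with a brute-force count of the distinct arrangements of two a's, two b's and one c:

```python
import math
from itertools import permutations

def multinomial(*group_sizes):
    """n! / (n_1! n_2! ... n_k!) with n equal to the sum of the group sizes."""
    result = math.factorial(sum(group_sizes))
    for size in group_sizes:
        result //= math.factorial(size)
    return result

# Brute force: permute all 5 letters, then discard repeated arrangements.
distinct = len(set(permutations("aabbc")))
print(distinct, multinomial(2, 2, 1))  # 30 30
```

The same function gives `multinomial(5, 4, 2) == 6930`, the number of arrangements of 5 red, 4 white and 2 black counters in a row.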


EXAMPLE 5.1 
A set of m parallel lines intersect n parallel lines, having a different direction, 
to form a network of parallelograms. How many parallelograms are formed ? 


Solution On any one line of the second set, there are m points of intersection
by lines of the first set. The number of pairs of such points is $\binom{m}{2}$. Similarly,
on any line of the first set, there are $\binom{n}{2}$ pairs of points of intersection by the
second set. Therefore

The number of parallelograms in the grid $= \binom{m}{2}\binom{n}{2} = mn(m-1)(n-1)/4$.


EXAMPLE 5.2 
If $p_1$ and $p_2$ are the probabilities of two independent events, show that the
probability of the simultaneous occurrence of these two events is $p_1 p_2$.




In 18 games of chess, A wins 8, B wins 6, and 4 are drawn. A and B play a 
tournament of 3 games. On the basis of these data, estimate the probability 
that 

(a) A wins all three games (b) A and B win alternately 

(c) Two games are drawn (d) A wins at least one game. 

What are the odds against A losing the first two games to B? (A.E.B. 
A-level) 


Solution 
The probability that A will win, p(A) = 8/18 = 4/9 
The probability that B will win, p(B) = 6/18 = 1/3 
The probability that a game will be drawn, p(D) = 4/18 = 2/9


(a) The probability that A wins the first, second and third games is
$p(A_1A_2A_3) = p(A_1)p(A_2)p(A_3) = (4/9)^3 = 64/729$

(b) The probability of the mutually exclusive events $A_1B_2A_3$ and $B_1A_2B_3$ is
$p(A_1B_2A_3) + p(B_1A_2B_3) = (4/9 \times 1/3 \times 4/9) + (1/3 \times 4/9 \times 1/3) = 28/243$

(c) The probability that two games are drawn and one game is not drawn is
$3 \times p(D_1D_2\bar{D}_3) = 3 \times (2/9)^2 \times 7/9 = 28/243$

(d) The probability that A wins at least one game is
1 − p(A does not win a game) = 1 − p(draws and wins by B only)

$= 1 - (5/9)^3 = 604/729$


The probability $p(B_1B_2)$ that A loses the first two games to B includes the
probability that A loses the first two games and wins the third game plus the
probability that A loses the first two games and draws the third plus
the probability that A loses all three games.


$p(B_1B_2) = p(B_1B_2A_3) + p(B_1B_2D_3) + p(B_1B_2B_3)$
$= (1/3)^2 (4/9 + 2/9 + 1/3) = 1/9$


Thus, the odds against A losing the first two games are 8:1.
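
The answers to Example 5.2 may be cross-checked by summing over all 27 possible sequences of three results. This is a sketch only: it assumes, as the solution does, that games are independent with the estimated single-game probabilities, and the labels "A", "B", "D" and the helper `prob` are illustrative names:

```python
from fractions import Fraction
from itertools import product

# Single-game probabilities estimated from the 18 recorded games.
p = {"A": Fraction(8, 18), "B": Fraction(6, 18), "D": Fraction(4, 18)}

def prob(condition):
    """Sum the probabilities of all 3-game sequences satisfying `condition`."""
    total = Fraction(0)
    for seq in product("ABD", repeat=3):
        if condition(seq):
            total += p[seq[0]] * p[seq[1]] * p[seq[2]]
    return total

a = prob(lambda s: s == ("A", "A", "A"))
b = prob(lambda s: s in {("A", "B", "A"), ("B", "A", "B")})
c = prob(lambda s: s.count("D") == 2)
d = prob(lambda s: "A" in s)
lose_first_two = prob(lambda s: s[:2] == ("B", "B"))

print(a, b, c, d, lose_first_two)  # 64/729 28/243 28/243 604/729 1/9
```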




EXAMPLE 5.3 
5-card hands are drawn from a pack. What is the probability of drawing (a) 
precisely two aces, (b) at least two aces? 


Solution In drawing 5-card hands, the possible number of outcomes is $\binom{52}{5}$.

(a) Number of aces in the pack = 4
Number of ways in which a pair of aces may be drawn $= \binom{4}{2}$

The other three cards in a hand occur as selections of 3 drawn from the
other 48 cards and therefore may be drawn in $\binom{48}{3}$ ways.

Therefore 5-card hands containing 2 aces occur in $\binom{4}{2}\binom{48}{3}$ ways.
Therefore, required probability

$$= \frac{\binom{4}{2}\binom{48}{3}}{\binom{52}{5}} = \frac{\dfrac{4!}{2!\,2!} \cdot \dfrac{48!}{3!\,45!}}{\dfrac{52!}{5!\,47!}} = \frac{2\,162}{54\,145}$$

(b) The hands must contain 2, 3 or 4 aces. The alternatives to this are that
the hands contain either 0 or 1 ace.

Hands containing no aces occur in $\binom{48}{5}$ ways

Hands containing 1 ace occur in $\binom{4}{1}\binom{48}{4}$ ways

Therefore probability of the alternatives

$$= \frac{\binom{48}{5} + \binom{4}{1}\binom{48}{4}}{\binom{52}{5}} = 51\,888/54\,145$$

Probability of at least two aces = 1 − 51 888/54 145 = 2 257/54 145
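
The arithmetic of Example 5.3 can be confirmed with exact fractions. In this sketch the helper `p_aces` is an illustrative name, and `math.comb` plays the role of the binomial coefficient:

```python
from fractions import Fraction
from math import comb

hands = comb(52, 5)  # all 5-card hands

def p_aces(k):
    """Probability that a 5-card hand holds exactly k aces."""
    return Fraction(comb(4, k) * comb(48, 5 - k), hands)

exactly_two = p_aces(2)
at_least_two = 1 - p_aces(0) - p_aces(1)

print(exactly_two, at_least_two)  # 2162/54145 2257/54145
```

Summing `p_aces(2)`, `p_aces(3)` and `p_aces(4)` directly gives the same value as the complement used in part (b).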


EXAMPLE 5.4 


During one term, a college team plays 6 cricket matches which result in 3 
wins, 2 losses and 1 draw. In how many different ways is this possible? 




Solution The required answer is the number of ways in which the 6 opposing 
teams may be partitioned into 3 groups containing respectively 3 losing 
teams, 2 winning teams and 1 drawing team. Therefore 


Required number of ways = 6!/(3! 2! 1!) = 60 


EXAMPLE 5.5 

6 balls are thrown into 4 boxes so that each ball falls into one of the boxes 
and is equally likely to fall in any one of the boxes. Find the probability that 
the fourth box contains precisely two balls. 


Solution Each ball may fall, with equal probability, in any one of the four 
boxes. 

For 6 balls, the total number of outcomes is $4^6$.

The number of ways in which two balls may be selected for the fourth
box is $\binom{6}{2}$.

For any such selection, the remaining 4 balls may be distributed amongst
the remaining 3 boxes in $3^4$ ways.

Therefore

The number of favourable outcomes $= \binom{6}{2} 3^4$


Required probability $= \binom{6}{2} 3^4 / 4^6 = 0.2966$ approx.
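
Since there are only 4096 outcomes, Example 5.5 can also be settled by listing them all. The sketch below is illustrative (the variable names are assumptions); each outcome records which box each of the 6 balls falls into:

```python
from fractions import Fraction
from itertools import product
from math import comb

# Each outcome assigns one of boxes 1..4 to each of the 6 balls.
outcomes = list(product(range(1, 5), repeat=6))
favourable = sum(1 for o in outcomes if o.count(4) == 2)

p_enumerated = Fraction(favourable, len(outcomes))
p_formula = Fraction(comb(6, 2) * 3**4, 4**6)

print(p_enumerated, float(p_enumerated))  # 1215/4096 0.296630859375
```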


EXERCISES 5.1 


1. How many different groups of results are possible for 10 football matches? 
[59 049] 

2. Show that n persons may be seated around a circular table in (n — 1)! ways. 

3. How many integers are factors of the number $2^6 \times 3^4 \times 5^2 \times 7 \times 11$ not
counting 1 or the number itself? [418]

4. In how many ways can a pair of triangles be drawn with 6 given points as 
vertices, no three of the points being collinear? [10] 

5. In how many ways may 5 red counters, 4 white and 2 black be arranged in
a row? [6930]

6. Two dice are thrown. What are the probabilities that the total score will be
(a) 5, (b) 0, (c) 10, (d) 14, (e) less than 13? What is the probability of a score of
either 6 or a double? [(a) 1/9, (b) 0, (c) 1/12, (d) 0, (e) 1; 5/18]

7. What is the probability of throwing not more than 4 with a die? When two 
dice are thrown, what is the probability of scoring either 7 or 11? [2/3, 2/9] 

8. 4 cards are drawn at random from a pack of cards. What is the probability that 
they are honours? (A, K, Q, J, 10 are the honours cards.) [57/3 185] 




9. What is the probability that 5 cards drawn from a pack are all of the same suit? 
[33/16 660]

10. A hand of 5 cards is dealt from a well-shuffled pack. What is the probability 
that the hand consists of 5 cards in sequence but not necessarily of the same 
suit? [Hint: There are 9 sequences in 13 different cards.] [192/54 145] 

11. A bag contains 9 balls of which 2 are red, 3 white and 4 black. 3 balls are drawn 
at random from the bag. What is the probability that 
(a) the 3 balls are of different colours? 

(b) 2 balls are of the same colour and the third of a different colour?
(c) the 3 balls are of the same colour? [(a) 2/7, (b) 55/84, (c) 5/84]

12. A bag contains 3 white, 4 red and 6 black balls. Two balls are drawn. How
many possible outcomes are there? In how many of these will the two balls be
of the same colour? What is the probability that the two balls will be of different
colours? [78, 24, 9/13]

13. Find the number of permutations of the following symbols
(a) A, B, C, D, E (b) A, A, C, D, E (c) A, A, B, B, B
[(a) 120, (b) 60, (c) 10]

14. 15 students register to take a course which is provided at 3 different times. 

(a) In how many different ways could the students be assigned to three classes?
(b) In how many different ways can the students be divided equally between
3 classes? [(a) $3^{15}$, (b) 756 756]

15. 9 students are to be assigned to 3 rooms, each capable of accommodating 3 
students. In how many ways may this be done? If 2 particular students refuse 
to share a room, in how many ways may the students be accommodated ? 

[1 680, 1 260] 

16. 10 applicants for an appointment are interviewed by 3 persons who inde- 
pendently place the candidates in order of merit. It is decided to appoint that 
candidate who is placed first by at least 2 of the 3 interviewers. Estimate the 
probability of the appointment of some particular candidate. [0.028]


5.6 Expectation or Expected Value 


Suppose that a variable x assumes n values $x_i$ with the respective
probabilities $p_i$ (i = 1, 2, ..., n) in n mutually exclusive outcomes of a trial. The
n pairs of values $x_i$, $p_i$ constitute the probability distribution of the discrete
random variable x.

If x is a continuous variable, one cannot speak of the probability of a
particular value of x. Instead we consider the probability that the value of the
variable lies in an infinitesimal interval $x - \tfrac{1}{2}dx$ to $x + \tfrac{1}{2}dx$ and restrict our
attention to those cases in which this probability is expressible in the form
p(x)dx where p(x) is a continuous function. p(x) is called the probability
density function (p.d.f.) or probability function.




DEFINITION 5.4 


The expectation or expected value of the discrete random variable x is defined 
to be 


$$E(x) = \sum_{i=1}^{n} p_i x_i \qquad (5.7)$$


If x is a continuous variable having a continuous probability density function 
p(x), the expectation of x is defined to be* 


$$E(x) = \int_{-\infty}^{+\infty} x\,p(x)\,dx \qquad (5.8)$$


For any such probability distribution, the expected value E(x) may be 
interpreted as the average or mean value of x and will be denoted by $\mu$.


EXAMPLE 5.6 


(A) In a gambling game, a player rolls a die and is paid as many pounds as 
the number he throws. What entrance fee should he pay for a fair game? 


Solution The player would expect to win 
£{(1/6)(1) + (1/6)(2) + ··· + (1/6)(6)} = £3.50


so that he would expect to cover an entrance fee of £3.50, in the long run, by 
his winnings. 


(B) An urn contains 3 red balls and 10 black. 4 balls are drawn together. Find 
the probability distribution of x, the number of red balls drawn, and the 
expected value of x. 


Solution When 4 black balls are drawn, x = 0. Therefore
Probability that x = 0 is

$$\binom{10}{4} \Big/ \binom{13}{4} = \frac{42}{143}$$

Probability of drawing 3 black balls and 1 red (i.e. of x = 1) is

$$\binom{10}{3}\binom{3}{1} \Big/ \binom{13}{4} = \frac{72}{143}$$


* If p(x) is defined for a limited range of x, eqn (5.8) is still valid since p(x) is taken 
to be zero outside this range. 




The probability of drawing 2 black balls and 2 red (i.e. of x = 2) is

$$\binom{10}{2}\binom{3}{2} \Big/ \binom{13}{4} = \frac{27}{143}$$

The probability of drawing 1 black ball and 3 red (i.e. of x = 3) is

$$\binom{10}{1}\binom{3}{3} \Big/ \binom{13}{4} = \frac{2}{143}$$

Note that, as expected, the sum of these probabilities is 1. The mean
number of red balls drawn is
E(x) = 0 × 42/143 + 1 × 72/143 + 2 × 27/143 + 3 × 2/143
= 132/143
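
The distribution in part (B) can be tabulated mechanically. The following sketch uses exact fractions; the names `red`, `black`, `drawn` and `dist` are chosen for the example only:

```python
from fractions import Fraction
from math import comb

red, black, drawn = 3, 10, 4
total = comb(red + black, drawn)   # 715 equally likely selections

# p(x) for x = 0, 1, 2, 3 red balls among the 4 drawn.
dist = {
    x: Fraction(comb(red, x) * comb(black, drawn - x), total)
    for x in range(red + 1)
}
mean = sum(x * p for x, p in dist.items())

print(sum(dist.values()), mean)  # 1 12/13
```

The printed mean 12/13 is 132/143 in lowest terms.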
(C) In an indefinite series of independent trials, there is a constant
probability p of success. Show that the expectation of the number of failures
preceding the first success is $p^{-1} - 1$.


Solution Let q = (1 − p) be the constant probability of a failure and let x
be the number of failures preceding the first success. Therefore
$qp$ = the probability of 1 failure followed by the first success
$q^2p$ = the probability of 2 failures followed by the first success
$q^3p$ = the probability of 3 failures followed by the first success and so on.

$$E(x) = 1 \times qp + 2 \times q^2p + 3 \times q^3p + \cdots = qp(1 + 2q + 3q^2 + \cdots)$$
$$= qp(1-q)^{-2} = qp \times p^{-2} = qp^{-1} = (1-p)p^{-1} = p^{-1} - 1$$


(D) x is a continuous variable whose probability density function is 


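$$p(x) = \begin{cases} k(6 - x - x^2) & (-3 < x < 2) \\ 0 & \text{elsewhere} \end{cases}$$

k is a constant. What is the expected value of x?


Solution The constant k must be chosen so that the total probability over
the range of x (−3 < x < 2) is unity, i.e.

$$\int_{-\infty}^{+\infty} p(x)\,dx = k\int_{-3}^{2} (6 - x - x^2)\,dx = (125/6)k = 1 \qquad k = 6/125$$

$$E(x) = \int_{-\infty}^{+\infty} x\,p(x)\,dx = \frac{6}{125}\int_{-3}^{2} x(6 - x - x^2)\,dx = \frac{6}{125}\left(-\frac{125}{12}\right) = -\frac{1}{2}$$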
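
Both integrals in part (D) involve only polynomials, so they can be evaluated exactly. The sketch below is an illustration, not part of the original solution; `integrate_poly` is a hypothetical helper that integrates a polynomial given by its list of coefficients:

```python
from fractions import Fraction

def integrate_poly(coeffs, lo, hi):
    """Exact integral of sum(coeffs[k] * x**k) from lo to hi."""
    def antiderivative(x):
        return sum(Fraction(c, k + 1) * Fraction(x) ** (k + 1)
                   for k, c in enumerate(coeffs))
    return antiderivative(hi) - antiderivative(lo)

density = [6, -1, -1]        # 6 - x - x^2
x_density = [0, 6, -1, -1]   # x(6 - x - x^2)

k = 1 / integrate_poly(density, -3, 2)        # total probability must be 1
mean = k * integrate_poly(x_density, -3, 2)   # eqn (5.8)

print(k, mean)  # 6/125 -1/2
```

The value −1/2 is the mid-point of the range (−3, 2), as expected for a density symmetric about that point.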




EXERCISES 5.2 
1. An unbiased die is thrown. What is the expected value of the number thrown?
[3½]
2. 2 balls are drawn, without replacement, from an urn containing 3 black and 4 
white balls. The drawer will receive £2.10 for each black ball drawn and £1.40 
for each white. What is his expectation? [£3.40] 


3. Given that the probability of x successes in n trials is $\binom{n}{x} p^x q^{n-x}$
(x = 0, 1, 2, ..., n), where p is the probability of a success and q the probability
of a failure in any trial, show that the mean value of x is np (binomial distribution).


4. The probability density function of a continuous variable x is given by 


$$p(x) = \begin{cases} c\,e^{-x/a} & (0 < x < \infty) \\ 0 & \text{elsewhere} \end{cases}$$


c is a constant. Find the expected value of x. [c = 1/a, expected value = a]


5. A chord is drawn, in a random direction, from a point on the circumference of
a circle of radius a. Find the mean value of the length of the chord. [$4a/\pi$]


5.7 The Expectation of Functions of a Random Variable 


The concept of expectation may be generalized. Let g(x) be an arbitrary 
function of the random variable x, then the expectation of g(x) is defined 
to be 


$$E[g(x)] = \sum_{i} p_i\,g(x_i) \quad \text{for a discrete variable} \qquad (5.9)$$

$$\phantom{E[g(x)]} = \int_{-\infty}^{+\infty} p(x)\,g(x)\,dx \quad \text{for a continuous variable} \qquad (5.10)$$


It follows easily that 


$$E[g_1(x) + g_2(x)] = E[g_1(x)] + E[g_2(x)] \qquad (5.11)$$


DEFINITION 5.5 
The second moment of the probability distribution of a random variable x 
about the origin (x = 0) is defined to be 


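$$\mu_2' = E(x^2) = \sum_{i} p_i x_i^2 \quad \text{for a discrete variable} \qquad (5.12)$$

$$\phantom{\mu_2' = E(x^2)} = \int_{-\infty}^{+\infty} p(x)\,x^2\,dx \quad \text{for a continuous variable} \qquad (5.13)$$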




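A more important statistical parameter, the variance $\mu_2$ of the distribution,
is the second moment with respect to the mean $\mu$ and is defined to be

$$\mu_2 = E(x - \mu)^2 = \sum_{i} (x_i - \mu)^2 p_i \quad \text{for a discrete variable} \qquad (5.14)$$

$$\phantom{\mu_2 = E(x - \mu)^2} = \int_{-\infty}^{+\infty} (x - \mu)^2 p(x)\,dx \quad \text{for a continuous variable} \qquad (5.15)$$

$$\mu_2 = E(x - \mu)^2 = E(x^2 - 2\mu x + \mu^2)$$
$$= E(x^2) - 2\mu E(x) + \mu^2$$
$$= E(x^2) - \{E(x)\}^2 \quad \text{since } \mu = E(x) \qquad (5.16)$$
$$= \mu_2' - \mu^2 \qquad (5.17)$$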


DEFINITION 5.6 
The standard deviation $\sigma$ of a probability distribution is defined by


$$\sigma^2 = \mu_2$$


EXAMPLE 5.7 


(A) Show that the standard deviation of x in Example 5.6(B) is 104/143
approximately.


Solution


$$\mu_2' = E(x^2) = 0 \times 42/143 + 1^2 \times 72/143 + 2^2 \times 27/143 + 3^2 \times 2/143 = 198/143$$


Variance $= \mu_2 = \mu_2' - \mu^2 = 198/143 - (132/143)^2 = 10\,890/(143)^2$
Standard deviation $\sigma = \sqrt{\mu_2} = 104/143$ approx.


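(B) A discrete variable takes the value x with a probability $e^{-\mu}\mu^x/x!$
(x = 0, 1, 2, ...). Verify that $\mu$ is the value of both the mean and variance of the
distribution (Poisson distribution).


Solution


$$\text{Mean} = E(x) = \sum_{x=0}^{\infty} x\,\frac{e^{-\mu}\mu^x}{x!} = e^{-\mu}\mu \sum_{x=1}^{\infty} \frac{\mu^{x-1}}{(x-1)!} = e^{-\mu}\mu\,e^{\mu} = \mu$$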




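Second moment about the origin is


$$\mu_2' = \sum_{x=0}^{\infty} x^2\,\frac{e^{-\mu}\mu^x}{x!} = \sum_{x=0}^{\infty} \{x(x-1) + x\}\frac{e^{-\mu}\mu^x}{x!} = \sum_{x=0}^{\infty} x(x-1)\frac{e^{-\mu}\mu^x}{x!} + \mu$$

$$= e^{-\mu}\mu^2 \sum_{x=2}^{\infty} \frac{\mu^{x-2}}{(x-2)!} + \mu = e^{-\mu}\mu^2 e^{\mu} + \mu = \mu^2 + \mu$$


Variance $= \mu_2 = \mu_2' - \mu^2 = \mu$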
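
A numerical check of this result is straightforward. The sketch below is only an illustration: the function name `poisson_moments`, the value μ = 2.5 and the truncation after 200 terms are all assumptions made for the example. It sums the series for the mean and the second moment and applies eqn (5.17):

```python
import math

def poisson_moments(mu, terms=200):
    """Mean and variance of the Poisson distribution from truncated sums."""
    p = math.exp(-mu)          # p(0)
    mean = second = 0.0
    for x in range(1, terms):
        p *= mu / x            # p(x) = p(x - 1) * mu / x
        mean += x * p
        second += x * x * p
    return mean, second - mean**2   # variance = second moment - mean^2

mean, var = poisson_moments(2.5)
print(mean, var)  # both agree with mu = 2.5 up to rounding error
```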


EXERCISES 5.3 

1. With reference to Exercise 5.2, no. 1, show that the standard deviation of the 
probability distribution of throws of an unbiased die is 1.71 approximately. 

2. Show that the variance of the probability distribution of Exercises 5.2, no. 4 is $a^2$.

3. Show that for the binomial distribution (see Exercises 5.2, no. 3) the variance 
is npq. 

4. A chord is drawn in a random direction from a point on the circumference of
a circle of radius a. Find the mean and the variance of the length of the chord.
[$4a/\pi$, $2a^2(1 - 8/\pi^2)$]

5. A point P is selected at random on a line AB of length 2a. Show that the expected
value of the area AP × PB is $2a^2/3$ and that the probability that the area will
exceed $a^2/2$ is $1/\sqrt{2}$.

6. A chord is drawn parallel to a given diameter of a circle of radius a, its distance
from the centre of the circle being chosen at random. Show that the mean and
variance of the length of the chord are $\pi a/2$ and $a^2(32 - 3\pi^2)/12$ respectively.
Show that the probability that the length of the chord will exceed $a\sqrt{3}$ is 1/2.


5.8 Joint Probability Density Functions 


So far, probability density functions for one random variable only have 
been considered. However, many problems arise which involve two or 
more random variables. For simplicity, consider first a case involving 
two continuous random variables, x and y only. 

If a continuous function p(x, y) exists such that the probability of the 
simultaneous occurrence of particular values of x and y is p(x, y), then 
p(x, y) is said to be the joint probability density function of the two random 
variables x and y. 

The probability that x falls in the interval dx and that y falls simultaneously 
in the interval dy is therefore p(x, y) dx dy and it follows that 


$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} p(x, y)\,dx\,dy = 1 \qquad (5.18)$$




If x and y are discrete variables, the corresponding result is

$$\sum_{i}\sum_{j} p_{ij} = 1 \qquad (5.19)$$

where $p_{ij}$ denotes the probability that x and y take simultaneously the values
$x_i$, $y_j$ respectively.


DEFINITION 5.7 

Two random variables x and y are said to be independent when the probability
that x will assume a particular value is independent of the value of y, and
conversely.

Two such variables are said to be independently distributed and their
probability density functions (p.d.f.) will be functions of x only and y only
respectively. Let $p_1(x)$ and $p_2(y)$ denote the respective probability density
functions, then


$$p(x, y) = p_1(x)\,p_2(y) \qquad (5.20)$$


The generalization of the above definition for several independent variables
is obvious. If $p_i(x_i)$ are the p.d.f.s of n independently distributed random
variables $x_i$ (i = 1, 2, ..., n) then their joint p.d.f. $p(x_1, x_2, \ldots, x_n)$ is
given by

$$p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p_i(x_i) \qquad (5.21)$$


It follows that, if $\phi(x)$ and $\psi(y)$ are arbitrary functions of independent
random variables x and y, the expected value of $\phi(x)\psi(y)$ is given by


$$E[\phi(x)\psi(y)] = \int\!\!\int \phi(x)\psi(y)\,p(x, y)\,dx\,dy$$

$$= \int \phi(x)\,p_1(x)\,dx \int \psi(y)\,p_2(y)\,dy$$

$$= E[\phi(x)]\,E[\psi(y)] \qquad (5.22)$$


Generalizing, if $x_i$ (i = 1, 2, ..., n) are n independent random variables


$$E\left[\prod_{i=1}^{n} \phi_i(x_i)\right] = \prod_{i=1}^{n} E[\phi_i(x_i)] \qquad (5.23)$$
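
Equation (5.22) has an exact discrete analogue which is easy to verify. In the sketch below, two independent fair dice stand in for x and y; the functions `phi` and `psi` are arbitrary choices made for illustration and are not taken from the text:

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p_face = Fraction(1, 6)

def phi(x):   # arbitrary function of the first score
    return x * x

def psi(y):   # arbitrary function of the second score
    return 2 * y + 1

# Joint distribution of independent dice: p(x, y) = p1(x) p2(y) = 1/36.
e_product = sum(phi(x) * psi(y) * p_face * p_face
                for x, y in product(faces, faces))
e_phi = sum(phi(x) * p_face for x in faces)
e_psi = sum(psi(y) * p_face for y in faces)

print(e_product, e_phi * e_psi)  # 364/3 364/3
```

The factorization relies on the joint probabilities being products; for dependent variables the two printed values would in general differ.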




5.9 Probability Generating Functions 


DEFINITION 5.8 


The probability generating function G(t) of a random variable x, discrete or 
continuous, is defined by 


$$G(t) = E(t^x) \qquad (5.24)$$


When x is a discrete variable, taking the values $x_i$ with probabilities $p_i$
(i = 1, 2, ..., n),


$$G(t) = \sum_{i=1}^{n} p_i t^{x_i} \qquad (5.25)$$

Assuming that the series in eqn (5.25) may be summed to determine the
function G(t), then, by definition, it follows that the probability that x takes
the value $x_i$ is the coefficient of $t^{x_i}$ in the expansion of G(t) in powers of t.
Thus G(t) may be used to generate the probabilities of the distribution of x 
and is called the probability generating function (p.g.f.) of x. 
If x is a continuous variable 


$$G(t) = \int_{-\infty}^{+\infty} t^x p(x)\,dx \qquad (5.26)$$


The mean and the variance of a probability distribution may be conveniently 
derived from G(t). For the discrete variable, put t = 1 in (5.25) and we have 


$$G(1) = \sum_{i=1}^{n} p_i = 1 \qquad (5.27)$$


Differentiating (5.25) with respect to t, we have 


$$G'(t) = \sum_{i=1}^{n} p_i x_i t^{x_i - 1} \qquad (5.28)$$
$$G'(1) = \sum_{i=1}^{n} p_i x_i = \mu \qquad (5.29)$$


Multiplying (5.28) by t, we have 


$$tG'(t) = \sum_{i=1}^{n} p_i x_i t^{x_i} \qquad (5.30)$$




and differentiating (5.30) with respect to t, 


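$$G'(t) + tG''(t) = \sum_{i=1}^{n} p_i x_i^2 t^{x_i - 1}$$
so that

$$G'(1) + G''(1) = \sum_{i=1}^{n} p_i x_i^2 = E(x^2) = \mu_2' \qquad (5.31)$$
Since $\mu$ and $\mu_2'$ are now known, the variance $\mu_2$ is given by


$$\mu_2 = \sigma^2 = \mu_2' - \mu^2$$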
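
For a variable taking non-negative integer values, the p.g.f. is a polynomial and the steps (5.27) to (5.31) can be carried out mechanically. The sketch below does this for a fair die; storing the polynomial as a dictionary {power: coefficient} and the helper names `derivative` and `at_one` are choices made for the example:

```python
from fractions import Fraction

# p.g.f. of a fair die stored as {power of t: coefficient}.
G = {x: Fraction(1, 6) for x in range(1, 7)}

def derivative(poly):
    """Differentiate a polynomial stored as {power: coefficient}."""
    return {k - 1: k * c for k, c in poly.items() if k != 0}

def at_one(poly):
    """Evaluate at t = 1, i.e. add up the coefficients."""
    return sum(poly.values())

G1 = derivative(G)    # G'(t)
G2 = derivative(G1)   # G''(t)

mean = at_one(G1)                         # (5.29)
second_moment = at_one(G1) + at_one(G2)   # (5.31)
variance = second_moment - mean**2

print(at_one(G), mean, variance)  # 1 7/2 35/12
```

The variance 35/12 corresponds to a standard deviation of about 1.71.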


5.10 The Probability Generating Function of the Sum of 
Independent Random Variables 


If x and y are a pair of independent random variables 
$$E(t^{x+y}) = E(t^x t^y) = E(t^x)E(t^y)$$


Let the p.g.f.s of the variables x, y and their sum be respectively $G_x(t)$,
$G_y(t)$ and $G_{x+y}(t)$ so that


$$G_{x+y}(t) = E(t^{x+y}) = G_x(t)\,G_y(t) \qquad (5.32)$$


Thus the p.g.f. of the sum of two independent random variables equals the 
product of the p.g.f.s of the individual variables. 
It follows that, if $z = \sum_{i=1}^{n} x_i$ where the $x_i$ are independent
random variables, then


$$G_z(t) = \prod_{i=1}^{n} G_{x_i}(t) \qquad (5.33)$$

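
Equation (5.32) says that multiplying two p.g.f.s yields the distribution of the sum. As a sketch (the dictionary representation and the helper `multiply` are assumptions made for illustration), the product of the p.g.f.s of two fair dice is compared with a direct count over the 36 sample points:

```python
from fractions import Fraction
from itertools import product

def multiply(g, h):
    """Product of two p.g.f.s stored as {power of t: probability}."""
    out = {}
    for (i, a), (j, b) in product(g.items(), h.items()):
        out[i + j] = out.get(i + j, 0) + a * b
    return out

die = {x: Fraction(1, 6) for x in range(1, 7)}
two_dice = multiply(die, die)          # G_{x+y}(t) = G_x(t) G_y(t)

# Direct enumeration of the 36 sample points for comparison.
direct = {}
for x, y in product(range(1, 7), repeat=2):
    direct[x + y] = direct.get(x + y, 0) + Fraction(1, 36)

print(two_dice[7], two_dice[8])  # 1/6 5/36
```

Repeated use of `multiply` gives the p.g.f. of the sum of any number of independent variables, in the manner of eqn (5.33).
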
EXAMPLE 5.8 


(A) The probability density function of a discrete variable x is $(1/2)^x$ where
x takes discrete values 1, 2, 3, ... but is zero for all other values of x.
Obtain the p.g.f. and hence calculate the mean and the variance of the 
distribution. 




Solution 
$$G(t) = E(t^x) = \sum_{x=1}^{\infty} (1/2)^x t^x = \sum_{x=1}^{\infty} (t/2)^x = t/(2 - t)$$

Therefore $G'(t) = 2(2 - t)^{-2}$ and $G''(t) = 4(2 - t)^{-3}$
Mean, $\mu = G'(1) = 2$
$\mu_2' = G'(1) + G''(1) = 2 + 4 = 6$
Variance, $\mu_2 = \mu_2' - \mu^2 = 6 - 2^2 = 2$
(B) The probability density function of a discrete variable x is given by
$$p(x) = e^{-\mu}\mu^x/x! \qquad (x = 0, 1, 2, \ldots, \infty)$$
(Poisson distribution: see Example 5.7(B).)
Find the p.g.f. of the variable and deduce that the mean and the variance
of the distribution are both $\mu$.


Solution 
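$$G(t) = \sum_{x=0}^{\infty} t^x \frac{e^{-\mu}\mu^x}{x!} = e^{-\mu}\sum_{x=0}^{\infty} (\mu t)^x/x! = e^{-\mu}e^{\mu t} = e^{\mu(t-1)}$$
Therefore
$G'(t) = \mu e^{\mu(t-1)}$ and $G''(t) = \mu^2 e^{\mu(t-1)}$
$G'(1) = \mu$ and $G''(1) = \mu^2$

Thus, Mean $= G'(1) = \mu$
$\mu_2' = G'(1) + G''(1) = \mu + \mu^2$

Variance $\mu_2 = \sigma^2 = \mu_2' - \mu^2 = \mu$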


(C) An unbiased coin is tossed 5 times. Each head scores 3 and each tail 
scores —2. What is the probability of scoring more than 5? 


Solution Consider this type of problem more generally. Let $p_i$, $q_i$ be the
respective probabilities of success and failure in the ith trial of a series of n
independent trials. Let the scores for each success and each failure be
respectively $\alpha$, $\beta$.

In the first trial, $p_1$ is the probability that the score $x_1 = \alpha$ and $q_1$ that
$x_1 = \beta$, so that the p.g.f. is $p_1 t^{\alpha} + q_1 t^{\beta}$. For n independent trials, it follows
by eqn (5.33) that the p.g.f. of the total score $z = \sum_{i=1}^{n} x_i$ is

$$\prod_{i=1}^{n} (p_i t^{\alpha} + q_i t^{\beta})$$

If the probabilities $p_i$ and $q_i$ are constant (i.e. p and q) from trial to trial,
the p.g.f. of the total score in n independent trials is

$$(pt^{\alpha} + qt^{\beta})^n$$

The only possible scores are equal to the powers of t which occur in the
expansion of $(pt^{\alpha} + qt^{\beta})^n$.
In the above example, the p.g.f. of the total score is 


$$(\tfrac{1}{2}t^3 + \tfrac{1}{2}t^{-2})^5 = \frac{t^{-10}}{32}(t^5 + 1)^5$$
$$= \frac{t^{-10}}{32}\{t^{25} + 5t^{20} + 10t^{15} + 10t^{10} + 5t^5 + 1\}$$
$$= \frac{1}{32}\{t^{15} + 5t^{10} + 10t^5 + 10 + 5t^{-5} + t^{-10}\}$$

The only possible scores are 15, 10, 5, 0, -5, -10. The probability of
scoring more than 5, i.e. of scoring 10 or 15, is

$$\frac{1}{32}(1 + 5) = 3/16$$
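
The expansion of $(pt^{\alpha} + qt^{\beta})^n$ can also be carried out mechanically. The sketch below builds the coefficients one trial at a time with exact fractions; the function name `score_pgf` is an assumption made for illustration.

```python
from fractions import Fraction

def score_pgf(p, alpha, beta, n):
    """Coefficients of (p t**alpha + q t**beta)**n as {score: probability}."""
    q = 1 - p
    total = {0: Fraction(1)}          # p.g.f. of a zero score: t**0
    for _ in range(n):
        nxt = {}
        for s, pr in total.items():
            nxt[s + alpha] = nxt.get(s + alpha, 0) + pr * p   # a success
            nxt[s + beta] = nxt.get(s + beta, 0) + pr * q     # a failure
        total = nxt
    return total

# Part (C): 5 tosses, a head scores 3, a tail scores -2.
dist = score_pgf(Fraction(1, 2), 3, -2, 5)
print(sorted(dist))                                  # [-10, -5, 0, 5, 10, 15]
print(sum(pr for s, pr in dist.items() if s > 5))    # 3/16
```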


EXERCISES 5.4 

1. With reference to Exercises 5.2, no. 3, show that the p.g.f. for the binomial
distribution for n independent trials is $(pt + q)^n$. Deduce that the mean and the
variance of the distribution are np and npq respectively.

2. A coin is tossed 4 times. Each head scores 2 and each tail —1. What is the 
probability of scoring 2 or more? [11/16] 

3. An unbiased coin is thrown 5 times. Each time a head is thrown, 2 is scored, and
each time a tail appears, 1 is subtracted from the score. Show that the p.g.f. of
the score is $(t^3 + 1)^5/(2t)^5$. Hence show that the probability of scoring 1 is 5/16
and of scoring 1 or more is 13/16.




4. A die is rolled 5 times. It is agreed that a throw of 5 or 6 will score 1 and that 
any other throw will score —1. What is the probability of scoring 1 or more? 
[17/81] 

5. Show that the p.g.f. for the score of an unbiased die is $t(1 - t^6)/6(1 - t)$. Such
a die is rolled 5 times; show that the probability of a score of 15 is 0.0837 
approximately. 


6. If all points of the x-axis between x = 0 and x = a are equally probable, show 
that the p.g.f. of the distribution of x is $(t^a - 1)/(a \log_e t)$.


5.11 Binomial and Multinomial Theorems 


The expansion of the binomial expression $(x + y)^n$ can easily be obtained
by a simple combinatorial method. Consider the expression in the form of a
product

$$(x + y)(x + y) \cdots (x + y) \quad \text{to } n \text{ factors}$$

The general term of the expansion is $Cx^{n-r}y^r$. We wish to find the coefficient C.

The term $x^{n-r}y^r$ arises by selecting x from $(n - r)$ of the factors and y from
the r remaining factors. The coefficient C of the general term is the number
of ways in which the term $x^{n-r}y^r$ can arise. C is therefore the number of ways
in which n factors may be divided into two groups such that one group
contains $(n - r)$ factors and the other group contains r factors. Therefore

$$C = \binom{n}{r}$$

Hence

$$(x + y)^n = x^n + \binom{n}{1}x^{n-1}y + \cdots + \binom{n}{r}x^{n-r}y^r + \cdots + y^n = \sum_{r=0}^{n}\binom{n}{r}x^{n-r}y^r \tag{5.34}$$


By a simple extension of this method, the expansion of the multinomial
expression $(x_1 + x_2 + x_3 + \cdots + x_k)^n$ may be obtained.
The general term of the expansion of this expression is

$$Cx_1^{n_1}x_2^{n_2}x_3^{n_3} \cdots x_k^{n_k} \quad \text{where } n = \sum_{i=1}^{k} n_i$$

The above term arises by selecting $x_1$ from $n_1$ of the n factors, $x_2$ from $n_2$ of
the factors, $x_3$ from $n_3$ of the factors, and so on. The number of ways in which
a term of this type will arise is equal to the number of ways in which the n
factors may be divided into k groups of $n_1, n_2, n_3, \ldots, n_k$ factors. Thus, by
eqn (5.6),

$$C = n!/(n_1!\, n_2!\, n_3! \cdots n_k!)$$
Therefore

$$(x_1 + x_2 + x_3 + \cdots + x_k)^n = \sum \frac{n!}{n_1!\, n_2!\, n_3! \cdots n_k!}\, x_1^{n_1}x_2^{n_2}x_3^{n_3} \cdots x_k^{n_k} \tag{5.35}$$

subject to $n = \sum_{i=1}^{k} n_i$, where each $n_i$ takes integral values from 0 to n.


5.12 Independent Trials with Two Outcomes 


Consider a series of repetitive independent experiments for which the outcome 
of any one experiment is either the occurrence of a certain event or its non- 
occurrence. The tossing of a coin, for which the outcome is either a head or 
no head, is an example of such an experiment. 

Let p be the probability of the occurrence of such an event so that (1 — p) = 
q is the probability of its non-occurrence. The occurrence of the event will be 
called a success and the non-occurrence a failure. Thus the probabilities of a 
success and a failure are respectively p and q. 

Suppose that n independent trials are made. The probability of x successes,
which are necessarily associated with $(n - x)$ failures, in any one particular
order is $p^x q^{n-x}$.

But x successes and $(n - x)$ failures may occur in $\binom{n}{x}$ different orders in n
trials and each permutation occurs with a probability $p^x q^{n-x}$. Since these
occurrences are mutually exclusive it follows that the total probability of x
successes in n trials is given by

$$f(x) = \binom{n}{x} p^x q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x} \tag{5.36}$$

$(x = 0, 1, 2, \ldots, n)$


f(x) is known as the Binomial or Bernoulli* Probability Function.
It should be noted that the probabilities in (5.36) are generated as terms of
the binomial expansion of

$$(q + p)^n = q^n + \binom{n}{1}q^{n-1}p + \binom{n}{2}q^{n-2}p^2 + \cdots + \binom{n}{x}q^{n-x}p^x + \cdots + p^n = \sum_{x=0}^{n} f(x) \tag{5.37}$$

Note that, as expected, $\sum_{x=0}^{n} f(x) = 1$ since (p + q) = 1.

* Bernoulli was a pioneer in applying probability theory to discrete variables.


Suppose that N samples, each of size n, are drawn from a large statistical 
population. Since, in the long run, probabilities may be taken to be relative 
frequencies, it follows that, if N is large enough, the frequencies of 0, 1,
2, ..., n successes, occurring in N samples, will be the successive terms of
the expansion of $N(q + p)^n$.


5.13 The Probability Generating Function 
of the Binomial Distribution 


From eqns (5.25) and (5.36), the p.g.f. of the binomial distribution is given by

$$G(t) = \sum_{x=0}^{n}\binom{n}{x}p^x q^{n-x}t^x = \sum_{x=0}^{n}\binom{n}{x}(pt)^x q^{n-x} = (q + pt)^n \tag{5.38}$$

$G'(t) = np(q + pt)^{n-1}$ and $G''(t) = n(n - 1)p^2(q + pt)^{n-2}$
$G'(1) = np$ and $G''(1) = n(n - 1)p^2$

Therefore, by (5.29), Mean, $\mu = G'(1) = np$ (5.39)
and by (5.31), $\mu_2' = G'(1) + G''(1) = np\{1 + (n - 1)p\}$

Variance, $\mu_2 = \mu_2' - \mu^2$
$= np\{1 + (n - 1)p - np\} = np(1 - p) = npq$ (5.40)

Standard deviation, $\sigma = \sqrt{npq}$ (5.41)




EXAMPLE 5.9 
Expand $(x + 2y - 3z)^3$.

Solution Consider first the expansion of $(x + y + z)^3$. The expansion may
be expressed as

$$\sum_{n_1, n_2, n_3} \frac{3!}{n_1!\, n_2!\, n_3!}\, x^{n_1}y^{n_2}z^{n_3}$$

where $n_1 + n_2 + n_3 = 3$.

The following is a list of all possible partitions of 3: (3, 0, 0) (2, 1, 0)
(2, 0, 1) (1, 1, 1) (1, 0, 2) (1, 2, 0) (0, 2, 1) (0, 1, 2) (0, 3, 0) (0, 0, 3).
Therefore

$$(x + y + z)^3 = x^3 + y^3 + z^3 + 3x^2y + 3xy^2 + 3y^2z + 3yz^2 + 3x^2z + 3xz^2 + 6xyz$$
$$(x + 2y - 3z)^3 = x^3 + 8y^3 - 27z^3 + 6x^2y + 12xy^2 - 36y^2z + 54yz^2 - 9x^2z + 27xz^2 - 36xyz$$
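
The expansion above can be reproduced from the multinomial coefficients of eqn (5.35). The following Python sketch is an illustration only; `multinomial_expand` is a hypothetical helper that takes the numerical multipliers of the variables (here 1, 2 and -3) and the power n.

```python
from math import factorial
from itertools import product

def multinomial_expand(coeffs, n):
    """Expand (c1*x1 + ... + ck*xk)**n.

    Returns {(n1, ..., nk): coefficient}, using n!/(n1! ... nk!).
    """
    terms = {}
    for powers in product(range(n + 1), repeat=len(coeffs)):
        if sum(powers) != n:
            continue                      # keep only partitions of n
        c = factorial(n)
        for ni in powers:
            c //= factorial(ni)           # multinomial coefficient
        for ni, ci in zip(powers, coeffs):
            c *= ci ** ni                 # numerical multipliers
        terms[powers] = c
    return terms

terms = multinomial_expand([1, 2, -3], 3)
print(len(terms))          # 10
print(terms[(1, 1, 1)])    # -36  (coefficient of xyz)
print(terms[(0, 1, 2)])    # 54   (coefficient of y z^2)
```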
EXAMPLE 5.10 


10 dice are thrown and a throw of 5 or 6 is a “‘success”. Find the probability 
of (a) 3 successes, (b) 3 successes at most, (c) 3 successes or more. 


Solution The probability of a success = probability of throwing 5 or 6
= p = 1/3
Therefore q = 2/3.

The probability of x successes in a throw of 10 dice (or in 10 throws of one
die) is

$$\binom{10}{x}(1/3)^x(2/3)^{10-x}$$

(a) The probability of 3 successes is $\binom{10}{3}(1/3)^3(2/3)^7 = 0.260$ approx.

(b) The probability of 0, 1 or 2 successes is

$$(2/3)^{10} + \binom{10}{1}(1/3)(2/3)^9 + \binom{10}{2}(1/3)^2(2/3)^8$$

= 0.0173 + 0.0867 + 0.1951 = 0.299 approx.

Thus, probability of 3 successes at most is
0.260 + 0.299 = 0.559 approx.
(c) The probability of 3 successes or more is

1 - (probability of 0, 1 or 2 successes) = 1 - 0.299 = 0.701 approx.
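
The three answers can be checked with a few lines of Python. This is a sketch under the assumption that floating-point accuracy is sufficient; `binomial_pmf` is an illustrative name for the function f(x) of eqn (5.36).

```python
from math import comb

def binomial_pmf(x, n, p):
    """f(x) of eqn (5.36): probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 1 / 3
exactly_3 = binomial_pmf(3, n, p)
fewer_than_3 = sum(binomial_pmf(x, n, p) for x in range(3))
print(round(exactly_3, 3))                  # 0.26
print(round(fewer_than_3, 3))               # 0.299
print(round(exactly_3 + fewer_than_3, 3))   # 0.559
print(round(1 - fewer_than_3, 3))           # 0.701
```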


EXAMPLE 5.11 

In a precision bombing attack, there is a 50% chance that any one bomb 
will strike the target. Two direct hits are required to destroy the target 
completely. How many bombs must be dropped to give approximately 99% 
chance of destroying the target? 


Solution The probability that a bomb will strike the target is p = 1/2. Thus 
q = 1/2. Let n bombs be dropped. Two or more hits will destroy the target. 
The probability of 2 or more hits from n bombs is 


1 - (the sum of the probabilities of 0 and 1 hits from n bombs)

$$= 1 - \{(1/2)^n + n(1/2)(1/2)^{n-1}\} = 1 - (n + 1)(1/2)^n$$

The minimum number of bombs is given by the least value of n such that
$1 - (n + 1)(1/2)^n > 0.99$
or $(n + 1)(1/2)^n < 0.01$
For n = 10, $11(1/2)^{10} = 0.0107 > 0.01$
For n = 11, $12(1/2)^{11} = 0.0059 < 0.01$
Therefore, the least number of bombs required is 11.
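
The search for the least n can be written as a short loop. The sketch below assumes the same model as the solution (independent drops, each hitting with probability p); the function name and default arguments are illustrative.

```python
def least_bombs(target=0.99, p=0.5):
    """Smallest n with P(at least 2 hits in n drops) > target."""
    q = 1 - p
    n = 2                                   # two hits need at least two bombs
    while 1 - (q ** n + n * p * q ** (n - 1)) <= target:
        n += 1
    return n

print(least_bombs())  # 11
```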
EXAMPLE 5.12 
100 samples, each containing 20 components, produced by a machine are
tested for defectives. The following table gives the frequency distribution of
samples containing 0, 1, 2, ... defectives.

No. of defectives per sample    0    1    2    3    4    5    6    7 or more
No. of samples                 21   35   25   14    4    1    0    0




Show that the mean number of defectives per sample is 1.48 and hence that 
an estimate of p, the probable proportion of defectives in the population, is 
0.074. Assuming a binomial distribution, use this value of p to calculate 
the probable number of samples containing 0, 1, 2,..., 6 defectives. 


Solution 


Mean no. of defectives per sample
$$= \tfrac{1}{100}(0 \times 21 + 1 \times 35 + 2 \times 25 + 3 \times 14 + 4 \times 4 + 5 \times 1 + 6 \times 0 + 7 \times 0)$$
= 148/100 = 1.48

Proportion of defectives in the population $= p = \tfrac{1}{20}(1.48) = 0.074$.
Thus q = 0.926.
On the assumption of a binomial distribution, the probable number of
samples containing x defectives is $100\binom{20}{x}p^x q^{20-x}$.

Thus the probable number of samples containing 0, 1, 2, ... defectives is

$$100\{q^{20},\ 20pq^{19},\ 190p^2q^{18},\ \ldots\}$$
$$= 100\{(0.926)^{20},\ 20(0.074)(0.926)^{19},\ 190(0.074)^2(0.926)^{18},\ \ldots\}$$
= 21, 34, 26, 12, 4, 1, 0
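
The fitting procedure of this example (estimate p from the sample mean, then scale the binomial terms by the number of samples) may be sketched in Python as follows. The variable names are illustrative; the computed frequencies are unrounded and should agree with the figures above to within about one sample.

```python
from math import comb

# Observed frequencies from the table: {defectives per sample: no. of samples}
observed = {0: 21, 1: 35, 2: 25, 3: 14, 4: 4, 5: 1, 6: 0}
size = 20                                                      # components per sample

n_samples = sum(observed.values())                             # 100
mean = sum(x * f for x, f in observed.items()) / n_samples     # 1.48
p = mean / size                                                # estimate of p

# Expected frequencies 100 * C(20, x) p^x q^(20 - x) for x = 0, ..., 6
expected = [n_samples * comb(size, x) * p ** x * (1 - p) ** (size - x)
            for x in range(7)]
print([round(e, 1) for e in expected])
```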


5.14 Combinatorial Generating Functions 


The calculation of the number of outcomes favourable to a certain event 
is often difficult. In Section 5.9, the concept of probability generating 
functions was introduced and some applications to problems of enumeration 
given. In this section, some examples of the application of other generating 
functions to problems of enumeration associated with the multinomial 
theorem (eqn 5.35) will be given. 

Consider the problem of Example 5.5 (p. 131) concerning the throwing
of 6 balls into 4 boxes. It will now be shown that the multinomial expansion

$$(x_1 + x_2 + x_3 + x_4)^6 = \sum_{n_1, n_2, n_3, n_4} \frac{6!}{\prod_{i=1}^{4} n_i!}\, x_1^{n_1}x_2^{n_2}x_3^{n_3}x_4^{n_4} \tag{5.42}$$

$(n_1 + n_2 + n_3 + n_4 = 6)$

may be interpreted in terms of the above problem. It will be noted that the
coefficient of the general term

$$\frac{6!}{\prod_{i=1}^{4} n_i!}\, x_1^{n_1}x_2^{n_2}x_3^{n_3}x_4^{n_4} \tag{5.43}$$

gives the number of ways in which 6 balls may be partitioned so that $n_1$, $n_2$,
$n_3$ and $n_4$ balls fall into 4 boxes respectively. On putting $x_1 = x_2 = x_3 = x_4 = 1$
in (5.42), it follows that the total number of possible outcomes is

$$\sum_{n_1, n_2, n_3, n_4} \frac{6!}{\prod_{i=1}^{4} n_i!} = 4^6$$

From eqn (5.43), $6!/(n_1!\, n_2!\, n_3!\, 2!)$ gives the number of outcomes for which 2
balls fall into the 4th box whilst $n_1$, $n_2$, $n_3$ balls $(n_1 + n_2 + n_3 = 4)$ fall
respectively into the other 3 boxes. Therefore the number of outcomes for
which 2 balls fall into the 4th box is

$$\frac{6!}{2!\,4!} \sum_{n_1, n_2, n_3} \frac{4!}{n_1!\, n_2!\, n_3!} = \binom{6}{2}(x_1 + x_2 + x_3)^4 = \binom{6}{2}3^4 \qquad (x_1 = x_2 = x_3 = 1)$$

Required probability $= \binom{6}{2}3^4 / 4^6$ (as before)

The expression $(x_1 + x_2 + x_3 + x_4)^6$ provides a simple generating function
for the solution of many similar problems concerning this particular physical 
system. This g.f. is of little value for the problem considered since it can be 
solved more simply otherwise. Nevertheless, the treatment is illustrative of 
the application of generating functions to much more complex problems when 
the enumeration of outcomes is otherwise difficult. 


We now introduce another type of generating function by applying it to the 
following problem. An urn contains 3 black balls and 5 white. 3 balls are 
successively drawn at random from the urn and are placed in a black box. 
The remaining 5 balls are placed in a white box. Find the probability that 
the sum of the number of black balls in the black box and the number of 
white balls in the white box is 4. 




This problem may be solved easily without a generating function. The
total number of ways in which 3 balls may be drawn from the urn is $\binom{8}{3}$.

The only way in which 4 may be obtained as the sum of the number of black
balls in the black box and of white balls in the white box is when there are 3
white balls in the white box and 1 black ball in the black box. The black box
will then contain 1 black ball and 2 white. The number of ways of filling the
black box will therefore be $\binom{3}{1}\binom{5}{2}$, which is the number of outcomes
favourable to the total of 4. Therefore

Required probability $= \binom{3}{1}\binom{5}{2} \Big/ \binom{8}{3} = 15/28$


Now consider the generating function 
$$(x_1 t + x_2)^3 (x_1 + x_2 t)^5 \tag{5.44}$$

where, similarly to the last example, the variables $x_1$, $x_2$ are associated with
the two boxes: $x_1$ with the black and $x_2$ with the white. In eqn (5.44), the
first factor is concerned with the partitioning of the 3 black balls between
the two boxes and the second factor with the partitioning of the 5 white balls
between them.

In this problem, we are interested only in those outcomes for which 3 balls
are in the black box and 5 are in the white box so that we are concerned with
the enumerations which occur as coefficients of $x_1^3 x_2^5$ in the expansion of
(5.44). Since t is associated with $x_1$ in the first factor and with $x_2$ in the
second factor of (5.44), the coefficient of $x_1^3 x_2^5 t^r$ in the expansion will give the
number of ways in which a total of r balls with colours matching the boxes
occur amongst all the possible outcomes for which 3 balls are in the black
box and 5 balls in the white box.

Putting t = 1 in (5.44), the g.f. becomes $(x_1 + x_2)^8$ from which the total
number of possible outcomes is

Coefficient of $x_1^3 x_2^5$ in the expansion of $(x_1 + x_2)^8 = \binom{8}{3}$ as before


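Equation (5.44) may be written in the form

$$\left[\sum_{m=0}^{3}\binom{3}{m}(x_1 t)^{3-m}x_2^m\right]\left[\sum_{n=0}^{5}\binom{5}{n}x_1^{5-n}(x_2 t)^n\right]$$

so that the general term of the product is

$$\binom{3}{m}\binom{5}{n}\, x_1^{8-m-n}x_2^{m+n}t^{3-m+n}$$

The term containing $x_1^3 x_2^5 t^4$ is that for which m = 2, n = 3 and is

$$\binom{3}{2}\binom{5}{3}\, x_1^3 x_2^5 t^4$$

so that the number of outcomes favourable to the sum of matching colours
equalling 4 is

$$\binom{3}{2}\binom{5}{3} = \binom{3}{1}\binom{5}{2}$$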


The generating function method has again been illustrated by application 
to a problem which can be solved much more easily otherwise. Nevertheless, 
such a problem would become much more complicated by the introduction 
of even one more colour and the g.f. method would be quite valuable. 
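
For readers who wish to see the generating function of eqn (5.44) expanded mechanically, the following Python sketch stores a polynomial in $x_1$, $x_2$ and t as a dictionary keyed by the three exponents. The helper names `poly_mul` and `poly_pow` are assumptions made for this illustration.

```python
from fractions import Fraction

def poly_mul(f, g):
    """Multiply polynomials stored as {(a, b, r): coeff} for x1**a x2**b t**r."""
    out = {}
    for (a1, b1, r1), c1 in f.items():
        for (a2, b2, r2), c2 in g.items():
            key = (a1 + a2, b1 + b2, r1 + r2)
            out[key] = out.get(key, 0) + c1 * c2
    return out

def poly_pow(f, n):
    """Raise a polynomial to the n-th power by repeated multiplication."""
    result = {(0, 0, 0): 1}
    for _ in range(n):
        result = poly_mul(result, f)
    return result

black = {(1, 0, 1): 1, (0, 1, 0): 1}   # x1*t + x2: a black ball matches the black box
white = {(1, 0, 0): 1, (0, 1, 1): 1}   # x1 + x2*t: a white ball matches the white box
gf = poly_mul(poly_pow(black, 3), poly_pow(white, 5))

favourable = gf[(3, 5, 4)]             # coefficient of x1^3 x2^5 t^4
total = sum(c for (a, b, r), c in gf.items() if (a, b) == (3, 5))
print(favourable, total, Fraction(favourable, total))   # 30 56 15/28
```

Adding a third colour only requires a fourth exponent in the key, which is where the method begins to save work.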

Consider the following problem: an urn contains n balls, $n_1$ being black,
$n_2$ being white and $n_3$ red. If $m_1$ balls are drawn and are placed in a black box,
$m_2$ are drawn and placed in a white box, and the remaining $m_3$ are placed in
a red box, find the probability that a total of r balls have the colours of the
boxes in which they are placed.

The appropriate generating function is

$$(x_1 t + x_2 + x_3)^{n_1}(x_1 + x_2 t + x_3)^{n_2}(x_1 + x_2 + x_3 t)^{n_3}$$

where $n = n_1 + n_2 + n_3 = m_1 + m_2 + m_3$.

The number of outcomes favourable to the desired event is the coefficient
of $x_1^{m_1}x_2^{m_2}x_3^{m_3}t^r$. It is much more difficult to find the number of favourable
outcomes without the use of the above g.f.


EXERCISES 5.5 
1. (a) Find the coefficient of $x^3yz^2$ in the expansion of $(2x - 3y + 5z)^6$.
(b) Find the expansion of $(a - 2b - 3c)^4$.
[(a) -36 000
(b) There are 15 terms. The expansion is $a^4 - 8a^3b - 12a^3c + 24a^2b^2 +
54a^2c^2 + 72a^2bc - 32ab^3 - 108ac^3 - 216abc^2 - 144ab^2c + 16b^4 + 96b^3c +
216bc^3 + 216b^2c^2 + 81c^4$]




2. A die is thrown 10 times. Show that the probability that a six appears at least 
twice is 0.516 approximately. 


3. In a factory, 10% of the output of a certain component is defective. 


(a) What is the probability that two or more components will be defective in a 
random sample of 10? 


(b) What is the largest sample size which will be at least 50% certain to contain
no defectives? [(a) 0.264 approx., (b) 6]


4. What is the probability that there are not more than 3 boys in a family of 8 
children? (Assume that the probability of a male or female birth is 1/2.) [93/256]


5. A machine produces parts of which 1% are defective. What is the smallest sample
size which should be inspected in order that the probability that there will be no
defectives in the sample is less than 0.05? [299]


6. On the average, hens lay eggs 4 days per week. On how many days during a 
season of 200 days may a poultry keeper expect to obtain 6 eggs from 8 hens? 
If he obtained 6 eggs or more on 42 days during the season, would this suggest 
that factors other than chance were operating? [36 approx.; yes] 


7. If f (x) is the probability of x successes in a binomial distribution [see eqn (5.36)], 
show that 


$$f(x + 1) = \frac{(n - x)p}{(x + 1)q}\, f(x)$$


8. 6 balls are thrown into 3 boxes and each ball is equally likely to fall into any
box.

(a) Show that the probability that 3 balls fall into the second box is 160/729.
(b) Show that the probability that each box will contain at least one ball is 20/27.

9. An urn contains 11 balls of which 6 are black and 5 are white. 4 balls are drawn
at random from the urn and are placed in a black box. The remaining 7 balls
are placed in a white box. Using a generating function, verify that the probability
that a total of 5 balls match the colours of the boxes containing them is

$$\binom{6}{2}\binom{5}{2} \Big/ \binom{11}{4}$$

5.15 Compound Probability 


Many applications of probability are concerned with the joint occurrence 
of two events A and B associated with an experiment. 

In the Venn diagrams of Fig. 5.5, let the rectangles denote the sample
space for the experiment and let the sets A and B comprise those sample
points associated with the occurrence of the events A and B respectively.

Figure 5.5
(a) Set A of sample points associated with event A
(b) Set B of sample points associated with event B
(c) Set $A \cup B$ (d) Set $A \cap B$

The set $A \cup B$ of sample points (Fig. 5.5(c)) comprises all those sample
points which are associated with the occurrence of the events A or B or
both A and B. It follows that the probability of the occurrence of the event
A or the event B or both, denoted by $p(A \cup B)$, will be the sum of the
probabilities associated with all the sample points in the set $A \cup B$.
Similarly the set $A \cap B$ (Fig. 5.5(d)), comprising the sample points which
are common to the sets A and B, contains those sample points which are
associated with the simultaneous occurrence of the events A and B. The
probability of the simultaneous occurrence of A and B will be denoted by
$p(A \cap B)$.
Since $A \cup B = A + B - (A \cap B)$ it follows that
$$p(A \cup B) = p(A) + p(B) - p(A \cap B) \tag{5.45}$$


If the two events A and B are mutually exclusive, the occurrence of one
event precludes the occurrence of the other which implies that the sets A and
B have no sample points in common, i.e. $A \cap B = \emptyset$.

Therefore, for two mutually exclusive events A and B,

$$p(A \cup B) = p(A) + p(B) \tag{5.46}$$

It may easily be verified, from a Venn diagram for three sets of sample
points associated with events A, B and C, that the probability of A or B
or C is

$$p(A \cup B \cup C) = p(A) + p(B) + p(C) - p(A \cap B) - p(B \cap C) - p(C \cap A) + p(A \cap B \cap C) \tag{5.47}$$




In general, for n events $A_1, A_2, \ldots, A_n$, it may be shown that

$$p(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_{i=1}^{n} p(A_i) - \sum_{i,j} p(A_i \cap A_j) + \sum_{i,j,k} p(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1}p(A_1 \cap A_2 \cap \cdots \cap A_n) \tag{5.48}$$

If the n events are mutually exclusive, all terms on the right-hand side of
(5.48) except the first, are zero. Therefore

$$p(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_{i=1}^{n} p(A_i) \tag{5.49}$$


for n mutually exclusive events. 


5.16 Conditional Probability 


Suppose that an event A has occurred and that we now wish to find the 
probability that an event B will occur subject to this condition. This is a 
conditional probability. 

With reference to Fig. 5.5, since the event A has occurred, we are now 
concerned only with the set A of sample points associated with that event. 
Thus the set A now comprises the whole sample space from which the 
conditional probability of the event B is to be determined. Since the total 
probability residing in any complete sample space must be 1, it will be 
necessary to multiply the probabilities $p_i$ originally assigned to the sample
points of the set A by a factor k so that the sum of the new probabilities $kp_i$
of the sample points of set A shall be 1.

Thus k is determined by the condition

$$\sum_{A} kp_i = 1$$
Therefore

$$k = 1 \Big/ \sum_{A} p_i = 1/p(A) \quad \text{provided } p(A) \ne 0$$

The sample points of the set A will therefore be assigned probabilities
$p_i/p(A)$.




Let p(B/A) denote the conditional probability of the event B, i.e. the
probability that B will occur when A has already occurred. The measure of
p(B/A) will thus be the sum of the probabilities $p_i/p(A)$ assigned to those
sample points associated with the occurrence of the event B which are in the
set A. Therefore

$$p(B/A) = \sum_{A \cap B} p_i/p(A) = p(A \cap B)/p(A) \tag{5.50}$$
$$p(A \cap B) = p(A)p(B/A) \tag{5.51}$$

By interchanging the order of the events, we have
$$p(A \cap B) = p(B)p(A/B) \tag{5.52}$$

Since $p(A \cap B)$ is the probability of the simultaneous occurrence of events A
and B, eqn (5.51) is the formula by which this probability may be determined
when the events A and B are not independent. The formula (5.51) may easily
be generalized for several dependent events. For three such events A, B
and C, we have
$$p(A \cap B \cap C) = p(A \cap B)p(C/A \cap B) = p(A)p(B/A)p(C/A \cap B) \tag{5.53}$$
and similar results may be obtained by permuting the symbols A, B and C.
By induction, it easily follows that, for n dependent events $A_1, A_2, A_3, \ldots, A_n$,
$$p(A_1 \cap A_2 \cap \cdots \cap A_n) = p(A_1)p(A_2/A_1)p(A_3/A_1 \cap A_2)\, p(A_4/A_1 \cap A_2 \cap A_3) \cdots p(A_n/A_1 \cap A_2 \cap \cdots \cap A_{n-1}) \tag{5.54}$$

Clearly n! similar equations result from permutations of the symbols $A_1, A_2, \ldots, A_n$.

Now suppose that an event A is independent of an event B in the sense 
that the probability of the occurrence of the event B has the same value 
whether A occurs or does not, so that 


$$p(B/A) = p(B)$$
With reference to (5.51), it follows that, for two independent events,

$$p(A \cap B) = p(A)p(B) \tag{5.55}$$




By comparison with (5.52), it follows that

$$p(A/B) = p(A)$$

which implies that the event A is also independent of the event B. Thus if B
is independent of A, A is also independent of B.
From (5.55), it follows that, for three independent events A, B and C,

$$p(A \cap B \cap C) = p(A \cap B)p(C) = p(A)p(B)p(C) \tag{5.56}$$

For n independent events, we have the obvious generalization

$$p(A_1 \cap A_2 \cap \cdots \cap A_n) = \prod_{i=1}^{n} p(A_i) \tag{5.57}$$


EXAMPLE 5.13 

An urn contains 3 white and 7 black balls. What is the probability that (a) 
one ball drawn is white, (b) two drawn are both white, (c) two drawn will 
be of the same colour, (d) two drawn will be of different colours, (e) two 
drawn will be 1 white and then 1 black? (The balls are drawn successively
without replacement.)

Solution There are altogether 10 balls.
(a) The probability of drawing 1 white ball (event A) is 3/10, i.e. p(A) = 3/10.
(b) After drawing 1 white, 2 white and 7 black balls remain.
Thus the probability of drawing a second white (event B) is 2/9.
This is a conditional probability expressed by p(B/A) = 2/9.
Therefore the probability of drawing two whites is

$$p(A \cap B) = p(A)p(B/A) = \frac{3}{10} \times \frac{2}{9} = \frac{1}{15}$$

This result may be obtained otherwise. The number of ways of selecting 2
balls from 10 is $\binom{10}{2}$ and the number of ways of selecting 2 white balls from
3 is $\binom{3}{2}$.
Thus, the probability of selecting two white balls $= \binom{3}{2} \Big/ \binom{10}{2} = \frac{1}{15}$.
(c) Similarly the probability of drawing two black balls is

$$\frac{7}{10} \times \frac{6}{9} = \frac{7}{15} \quad \text{or (alternatively)} \quad \binom{7}{2} \Big/ \binom{10}{2}$$

Since the drawing of two whites and of two blacks are mutually exclusive
events, the probability of drawing either two whites or two blacks is

$$\frac{1}{15} + \frac{7}{15} = \frac{8}{15}$$
(d) The drawing of two balls of different colours is the alternative to (c).
The probability of drawing two balls of different colours is $1 - \frac{8}{15} = \frac{7}{15}$.

Alternatively, the required probability is $\binom{3}{1}\binom{7}{1} \Big/ \binom{10}{2} = \frac{7}{15}$.

(e) The probability of drawing firstly a white ball is 3/10.
The conditional probability of then drawing a black is 7/9.

Required probability $= \frac{3}{10} \times \frac{7}{9} = \frac{7}{30}$


As expected, this probability is half that obtained in (d). 
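
These answers can be confirmed by listing every ordered pair of draws. The Python sketch below is illustrative only: it labels the ten balls by position and treats all 90 ordered draws without replacement as equally likely.

```python
from fractions import Fraction
from itertools import permutations

balls = "W" * 3 + "B" * 7                      # 3 white, 7 black
draws = list(permutations(range(10), 2))       # ordered pairs, no replacement

def prob(event):
    """Probability of an event over equally likely ordered draws."""
    hits = sum(1 for i, j in draws if event(balls[i], balls[j]))
    return Fraction(hits, len(draws))

both_white = prob(lambda a, b: a == b == "W")               # part (b)
same = prob(lambda a, b: a == b)                            # part (c)
different = prob(lambda a, b: a != b)                       # part (d)
white_then_black = prob(lambda a, b: (a, b) == ("W", "B"))  # part (e)
print(both_white, same, different, white_then_black)   # 1/15 8/15 7/15 7/30
```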


EXAMPLE 5.14 

What is the probability of throwing a sum of 9 and/or a difference of 3 
in a single throw of two dice? If a difference of 3 is thrown, what is the 
probability that the sum is 9? 


Solution Let the event A be a throw of total 9 and the event B be a throw of 
difference 3. 
4 outcomes (6, 3) (5, 4) (4, 5) (3, 6) comprise event A. Thus p(A) = 4/36. 
6 outcomes (6, 3) (5, 2) (4, 1) (1, 4) (2, 5) (3, 6) comprise event B. Thus 
p(B) = 6/36. 
Events A and B are not mutually exclusive since A and B have common 
elements (6, 3) and (3, 6) so that 


$$p(A \cap B) = 2/36$$
The required probability of A or B or both is

$$p(A \cup B) = p(A) + p(B) - p(A \cap B) = \frac{1}{9} + \frac{1}{6} - \frac{1}{18} = \frac{2}{9}$$

This result also follows from first principles by noting that there are only 8
different outcomes giving a sum of 9 and/or a difference of 3.
Since $p(A \cap B) \ne p(A)p(B)$, A and B are not independent, and

$$p(A/B) = \frac{p(A \cap B)}{p(B)} = \frac{2/36}{6/36} = \frac{1}{3}$$

This result again follows easily from first principles. Note that when the
event B has occurred, there are only 6 sample points to be considered when
calculating p(A/B). Only 2 of these 6, namely (6, 3) and (3, 6), lead to the
event A. Therefore p(A/B) = 2/6 = 1/3.
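
The same results follow from treating events as sets of sample points, which Python's set operations mirror directly. This is a sketch for illustration; the names `space`, `A`, `B` and `p` are chosen to match the notation of the solution.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))          # 36 equally likely throws
A = {s for s in space if sum(s) == 9}                 # total of 9
B = {s for s in space if abs(s[0] - s[1]) == 3}       # difference of 3

def p(event):
    """Probability of a set of sample points."""
    return Fraction(len(event), len(space))

print(p(A | B))                      # 2/9
print(p(A) + p(B) - p(A & B))        # 2/9, by eqn (5.45)
print(p(A & B) / p(B))               # 1/3, the conditional probability p(A/B)
```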


EXAMPLE 5.15 

Six cards are drawn from a pack. What is the probability that exactly two 
aces and two kings will be drawn (a) if the cards are drawn successively, (b) if 
each card is returned to the pack which is shuffled before the next card is 
drawn? 


Solution 
(a) Let the aces be denoted by A, the kings by K, and all other cards by X. 
Firstly, let us calculate the probability that the 6 cards are drawn in the 
particular order A, A, K, K, X, X. 

By eqn (5.54) for n = 6, this probability is given by

$$p(A \cap A \cap K \cap K \cap X \cap X) = p(A)p(A/A)p(K/A \cap A) \cdots p(X/A \cap A \cap K \cap K \cap X)$$

The probability p(A) of drawing the first ace from a pack of 52 cards is 4/52.
When one ace has been drawn, the probability p(A/A) of drawing a second
ace from the remaining 51 cards is 3/51 and so on. Therefore

$$p(A \cap A \cap K \cap K \cap X \cap X) = \frac{4}{52} \times \frac{3}{51} \times \frac{4}{50} \times \frac{3}{49} \times \frac{44}{48} \times \frac{43}{47}$$

This is the probability of drawing the 6 cards in one specified order. Since,
however, there are $6!/(2!)^3$, i.e. 90 permutations of the symbols A, A, K, K,
X, X, the required probability is the sum of the probabilities of all these
permutations. It is easily seen, however, that the probabilities of each of these
permutations are equal, e.g.

$$p(K \cap A \cap A \cap X \cap K \cap X) = \frac{4}{52} \times \frac{4}{51} \times \frac{3}{50} \times \frac{44}{49} \times \frac{3}{48} \times \frac{43}{47} = p(A \cap A \cap K \cap K \cap X \cap X)$$

Required probability $= 90 \times \frac{4}{52} \times \frac{3}{51} \times \frac{4}{50} \times \frac{3}{49} \times \frac{44}{48} \times \frac{43}{47}$

= 0.0017 approx.




(b) If, after drawing a card, it is returned to the pack which is shuffled before
the next card is drawn, the 6 constituent events comprising the compound
event become independent so that

Required probability $= 90\left(\frac{4}{52}\right)^2\left(\frac{4}{52}\right)^2\left(\frac{44}{52}\right)^2 = 0.0023$ approx.
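
Both parts can be checked with exact arithmetic. In the sketch below, part (a) is computed by the alternative route of counting unordered selections, which is an equivalent method and not the one used in the solution; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# (a) Without replacement: choose 2 of 4 aces, 2 of 4 kings, 2 of 44 other cards.
without = Fraction(comb(4, 2) * comb(4, 2) * comb(44, 2), comb(52, 6))

# The ordered-draw product from the solution, times the 90 orderings.
ordered = 90 * Fraction(4 * 3 * 4 * 3 * 44 * 43, 52 * 51 * 50 * 49 * 48 * 47)

# (b) With replacement: 90 orderings of six independent draws.
with_repl = 90 * Fraction(4, 52) ** 4 * Fraction(44, 52) ** 2

print(without == ordered)                                      # True
print(round(float(without), 4), round(float(with_repl), 4))    # 0.0017 0.0023
```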


EXAMPLE 5.16 
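Three persons A, B, C, in that order, successively throw a coin and the first
person to throw a tail wins. What are the respective probabilities that A, B
and C will win?

Solution The probability that A will win on his first throw is 1/2.
The probability that A, B and C all fail to win on their first throws is $(1/2)^3$.
Thus the probability that A will win on his second throw is $(1/2)(1/2)^3$ and the
probability that A will win on his third throw is $(1/2)(1/2)^3(1/2)^3$ and so on.
Therefore the total probability that A will win is

$$\frac{1}{2} + \left(\frac{1}{2}\right)^4 + \left(\frac{1}{2}\right)^7 + \cdots = \frac{1/2}{1 - (1/2)^3} = \frac{4}{7}$$

The probabilities that B will win on his first, second, third, ... throws are
respectively

$$\left(\tfrac{1}{2}\right)^2,\ \left(\tfrac{1}{2}\right)^3\left(\tfrac{1}{2}\right)^2,\ \left(\tfrac{1}{2}\right)^3\left(\tfrac{1}{2}\right)^3\left(\tfrac{1}{2}\right)^2,\ \ldots$$

Therefore the total probability that B will win is

$$\left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^5 + \left(\tfrac{1}{2}\right)^8 + \cdots = \frac{2}{7}$$
Similarly, the total probability that C will win is
$$\left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^6 + \left(\tfrac{1}{2}\right)^9 + \cdots = \frac{1}{7}$$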


As expected, the sum of these probabilities is 1. 
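
Each of the three totals is a geometric series with common ratio $(1/2)^3$, so the sums can be evaluated exactly. A brief Python sketch (variable names are illustrative):

```python
from fractions import Fraction

half = Fraction(1, 2)
ratio = half ** 3                  # all three players throw heads in one round
# Sum of a geometric series: first term / (1 - ratio).
wins = [half ** k / (1 - ratio) for k in (1, 2, 3)]   # A, B, C in turn
print(*wins)       # 4/7 2/7 1/7
print(sum(wins))   # 1
```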
EXAMPLE 5.17 


Eight cards are drawn from a pack one at a time with replacement. Find the 
probability that these 8 cards will include at least one card from each suit. 


Solution Let the event A be that for which the drawing of 8 cards includes at
least one card from each suit, and let the event B be the alternative for which
the drawing of 8 cards lacks at least one of the suits. Therefore

$$p(A) = 1 - p(B)$$




To solve this problem, we shall calculate p(B). All outcomes favourable to the
event B may be divided into subsets $B_1$, $B_2$, $B_3$ and $B_4$ comprising those sets
of outcomes which respectively lack hearts, diamonds, clubs and spades.
Thus, $B_1$, $B_2$, $B_3$ and $B_4$ are not mutually exclusive since, for example, an
outcome comprising clubs and spades belongs to both $B_1$ and $B_2$. By eqn
(5.48)

$$p(B) = p(B_1 \cup B_2 \cup B_3 \cup B_4) = \sum_{i} p(B_i) - \sum_{i,j} p(B_i \cap B_j) + \sum_{i,j,k} p(B_i \cap B_j \cap B_k) - p(B_1 \cap B_2 \cap B_3 \cap B_4)$$

Now $p(B_1)$ is the probability that no hearts will appear in a drawing of 8
cards. Therefore $p(B_1) = (3/4)^8$.
Similarly $p(B_2) = p(B_3) = p(B_4) = (3/4)^8$. Therefore

$$\sum_{i} p(B_i) = 4(3/4)^8$$

$p(B_1 \cap B_2)$ is the probability that neither hearts nor diamonds appear in a
drawing of 8 cards. Therefore $p(B_1 \cap B_2) = (1/2)^8$.

From the four subsets $B_i$, there are $\binom{4}{2}$, i.e. 6, pairs which may be selected,
the probability for each pair being also $(1/2)^8$. Therefore

$$\sum_{i,j} p(B_i \cap B_j) = 6(1/2)^8$$
Similarly

$$\sum_{i,j,k} p(B_i \cap B_j \cap B_k) = 4(1/4)^8$$

and $p(B_1 \cap B_2 \cap B_3 \cap B_4) = 0$ since the absence of each suit simultaneously
is impossible. Therefore

$$p(A) = 1 - 4(3/4)^8 + 6(1/2)^8 - 4(1/4)^8 = 0.623 \text{ approx.}$$
Now consider the additional problem as follows. Cards are drawn successively,
with replacement, until all suits appear at least once. Find the probability
that 8 cards must be drawn.




Let $p_n$ denote the probability that all suits will appear at least once when n
cards are drawn. Therefore, as above,

$$p_n = 1 - 4(3/4)^n + 6(1/2)^n - 4(1/4)^n$$

Let $p_n'$ denote the probability that all suits will first appear at the drawing
of the nth card. When n cards are drawn, there are probabilities $p_4', p_5',
p_6', \ldots, p_n'$ that all suits will first appear at the 4th, 5th, 6th, ..., nth
drawing and since these outcomes are mutually exclusive

$$p_n = p_4' + p_5' + p_6' + \cdots + p_n'$$
Thus $p_n' = p_n - p_{n-1}$.

In the given example, n = 8. Therefore

$$p_8' = p_8 - p_7 = \{1 - 4(3/4)^8 + 6(1/2)^8 - 4(1/4)^8\} - \{1 - 4(3/4)^7 + 6(1/2)^7 - 4(1/4)^7\} = 0.110$$
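
The inclusion-exclusion formula for $p_n$ and the difference $p_n - p_{n-1}$ can be evaluated directly. The Python sketch below assumes four equally likely suits on every draw; `p_all_suits` is an illustrative name.

```python
from fractions import Fraction
from math import comb

def p_all_suits(n):
    """Probability that n draws with replacement show all four suits.

    Inclusion-exclusion over j missing suits, as in eqn (5.48).
    """
    return sum((-1) ** j * comb(4, j) * Fraction(4 - j, 4) ** n
               for j in range(5))

p8 = p_all_suits(8)
first_complete_at_8 = p8 - p_all_suits(7)
print(round(float(p8), 3), round(float(first_complete_at_8), 3))   # 0.623 0.11
```

As a check on the formula, `p_all_suits(3)` is zero and `p_all_suits(4)` equals $4!/4^4$.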


EXERCISES 5.6 
1. A coin is tossed 3 times. Find the probability that (a) the first two tosses include 
a head and a tail, (b) there are more tails than heads. [1/2, 1/2]


2. In a throw of two dice, what is the probability of either a total of 4 or a total of 
8? What is the probability of a difference of 4 and/or a total of 8? [2/9, 7/36] 


3. 10 persons are seated at a circular table. Find the probability that a particular 
pair of persons are seated next to each other. [2/9] 


4. A die is thrown twice. What is the probability that the sum of the two throws 
exceeds 10 given that (a) one is a 6, (b) the first throw is a 6? [3/11, 1/3]


5. A set of 19 cards are numbered 1, 2, 3,..., 19. If one card is drawn, what is 
the probability that its number is a multiple of 2 or 3? What is the probability 
that the number will be a multiple of 2 or 3 or both? [9/19, 6/19, 12/19] 


6. A person holds two tickets for a draw in which there are 12 horses and 15 
blanks. Show that the probability of drawing a horse is 82/117. 


7. Each of a pair of dice has two faces numbered 1, two numbered 4 and the 
remaining two numbered 2 and 3. Set up a two dimensional sample space for 
a single throw of the two dice and indicate the sets of points $S_1$, $S_2$ and $S_3$
corresponding respectively to the events: (a) a total score of 6, (b) a total score
of 4, (c) the same score on each die.
Calculate the probability of (d) a score of 4 or 6, (e) scoring 4 or 6 when both 
dice show the same score. [5/18, 1/18] 


8. A group of 10 men and 6 women includes one married couple. A committee of 
4 men and 3 women is chosen by lottery. What is the probability that both 
the husband and wife will be members? [1/5] 




9. An unbiased die, having 6 sides, is thrown 5 times. Show that the probability 
that at least one 6 is thrown is 0.598 approximately. 


10. A bag contains 4 red balls and 3 blue. Two drawings of 2 balls are made. What 
is the probability that 2 red balls and then 2 blue balls are drawn if (a) the balls 
are returned to the bag after the first draw, (b) the balls are not returned? 
[2/49, 3/35] 


11. In a game of bridge, one player’s hand contains K, Q, J, 10, 9 of spades. What
is the probability that another player’s hand holds the ace of spades and exactly 
two other spades? [0.115 approx.] 


12. Find the probability of obtaining (a) more than 7 with a throw of 2 dice, (b) at
least one 6 in a throw of 3 dice. [5/12, 91/216] 


13. From a bag containing 4 white and 5 black balls, 3 are successively drawn at 
random. What are the odds against all 3 being black? [37 to 5] 


14. Two bags contain respectively 3 white and 2 black balls and 5 white and 3 
black balls. If a bag is chosen at random and 1 ball is drawn from it, what is 
the probability that it is white? [49/80] 


15. Four screws are selected from a large batch of screws of which 15% are de- 
fective. Show that the probability is 0.05% that these screws are all defective. 


16. Three groups, each of 5 people, comprise 3 men and 2 women, 2 men and 3 
women and 1 man and 4 women. One person is selected at random from each 
group. Show that probability is 58/125 that the selected group comprises 1 man 
and 2 women. 


17. Two drawings, each of 3 balls, are made from a bag containing 5 white balls 
and 8 black. Find the probability that the first drawing gives 3 white balls and is 
followed by a second drawing of 3 black balls, the balls being (a) replaced, (b)
not replaced before the second trial. [140/20 449, 7/429] 


18. A bridge player and his partner hold 9 hearts between them. What is the 
probability that their opponents hold 2 hearts each? [234/575] 


19. 3 cards are drawn from a pack. The hand is known to include at least 2 aces. 
What is the probability that it contains 3 aces? If this hand were known to 
include the 2 black aces, what is the probability that it contains 3 aces? 
[1/73, 1/25] 


20. A box contains 23 ball bearings, 8 of size A, 3 of size B and 12 of size C. 3 are 
selected at random. What is the probability that the 3 are (a) of size A, (b) of the 
same size, (c) of different sizes? [8/253, 277/1 771, 288/1 771] 


21. Show that, if 5 cards are drawn successively from a pack, the probability that 
there will be exactly two aces is approximately 0.040. If each card is replaced 
and the pack is shuffled before the next card is drawn, show that the probability 
of drawing exactly two aces is approximately 0.047. 


22. Two persons A and B alternately cut a pack of cards, shuffling after each cut. 
A starts and the first to cut a diamond wins. What are the probabilities of A 
and B winning? [4/7, 3/7] 




5.17 Bayes’ Theorem 


This theorem has applications to a particular class of problems involving 
conditional probabilities. The following is a simple typical case for which a 
solution from first principles is given. 

Two boxes contain respectively 3 white and 2 black balls and 2 white and 
4 black balls. One of the boxes is selected, the probability of selecting the 
first box being 3/4. A ball is drawn from the box selected. If this ball is black, 
what is the probability that it came from the first box? 

Let $A$ be the event of selecting the first box and let $\bar{A}$ be the complementary
event of selecting the second box. Therefore $p(A) = 3/4$ and $p(\bar{A}) = 1/4$.

Let $B$ be the event of drawing a black ball and let $\bar{B}$ be the complementary
event of drawing a white ball. Therefore $p(B/A) = 2/5$ and $p(B/\bar{A}) = 4/6 =
2/3$. It is required to calculate $p(A/B)$.

By eqn (5.50),

$$p(A/B) = p(A \cap B)/p(B) \qquad (5.58)$$

But by eqn (5.51),

$$p(A \cap B) = p(A)\,p(B/A) = \tfrac{3}{4} \times \tfrac{2}{5} = 3/10$$

Similarly

$$p(\bar{A} \cap B) = p(\bar{A})\,p(B/\bar{A}) = \tfrac{1}{4} \times \tfrac{2}{3} = 1/6$$

The event $B$ will occur only when either of the mutually exclusive events
$A \cap B$ or $\bar{A} \cap B$ occurs. Therefore

$$p(B) = p(A \cap B) + p(\bar{A} \cap B) = \tfrac{3}{10} + \tfrac{1}{6} = 28/60$$

By eqn (5.58),

$$p(A/B) = \frac{3/10}{28/60} = 9/14$$


The above example belongs to a class of problems in which the outcome 
of an experiment can result from any one of a number of independent events 
or “causes”. It is required to calculate the probability that this outcome has 
resulted from the occurrence of a particular one of these events. 

Problems of this type may conveniently be solved by applying the concept 
of sample space in which probability measures are represented by areas. 




Figure 5.6 


For the example above, let the contours A and B (Fig. 5.6(a)) enclose the
sample spaces corresponding to the events A and B. It is convenient to
represent the complete sample space (or universal probability set) by a
rectangle of unit area (Fig. 5.6(b)). The rectangles on the bases $A$, $\bar{A}$ have
areas 3/4, 1/4 respectively, representing the probabilities of the selection of the
first and second boxes. The curved line divides the unit rectangle into two
areas representing $p(B)$ and $p(\bar{B})$ which are initially unknown.

The rectangle is thus divided into four areas representing the probability
measures for $A \cap B$, $A \cap \bar{B}$, $\bar{A} \cap B$, $\bar{A} \cap \bar{B}$. The probability of the
simultaneous occurrence of the events A and B will be the sum of the probabilities
associated with the sample points in the region $A \cap B$. Corresponding
interpretations may be given to the total probabilities assigned to the
remaining regions $\bar{A} \cap B$, $A \cap \bar{B}$ and $\bar{A} \cap \bar{B}$. Let the symbols $x_1$, $x_2$, $y_1$ and $y_2$
denote these total probabilities. From Fig. 5.6(b), it follows immediately that

$$p(A) = x_1 + y_1 = 3/4 \quad \text{and} \quad p(B/A) = x_1/(x_1 + y_1) = 2/5$$
$$p(\bar{A}) = x_2 + y_2 = 1/4 \quad \text{and} \quad p(B/\bar{A}) = x_2/(x_2 + y_2) = 2/3$$

Therefore

$$x_1 = p(A)\,p(B/A) = \tfrac{3}{4} \times \tfrac{2}{5} = 3/10$$
$$x_2 = p(\bar{A})\,p(B/\bar{A}) = \tfrac{1}{4} \times \tfrac{2}{3} = 1/6$$

Required probability is

$$p(A/B) = x_1/(x_1 + x_2) = \tfrac{3}{10} \Big/ \left(\tfrac{3}{10} + \tfrac{1}{6}\right) = 9/14$$


An extension of the sample space method for n initial or causal events
$A_1, A_2, \ldots, A_n$ leads to Bayes’ Formula by which problems of this type may
be solved systematically.

In Fig. 5.7 the unit rectangle is subdivided by vertical lines into the n
probability spaces corresponding to the n alternative causal events $A_1,
A_2, \ldots, A_n$. As in Fig. 5.6(b), $B$ and $\bar{B}$ represent the event which has occurred
after the experiment has been performed and its alternative.

Figure 5.7

From Fig. 5.7, it follows that 


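$$\begin{aligned}
p(A_1) &= x_1 + y_1, & p(B/A_1) &= x_1/(x_1 + y_1), & \text{i.e. } x_1 &= p(A_1)\,p(B/A_1) \\
p(A_2) &= x_2 + y_2, & p(B/A_2) &= x_2/(x_2 + y_2), & \text{i.e. } x_2 &= p(A_2)\,p(B/A_2) \\
&\;\;\vdots \\
p(A_r) &= x_r + y_r, & p(B/A_r) &= x_r/(x_r + y_r), & \text{i.e. } x_r &= p(A_r)\,p(B/A_r) \\
&\;\;\vdots \\
p(A_n) &= x_n + y_n, & p(B/A_n) &= x_n/(x_n + y_n), & \text{i.e. } x_n &= p(A_n)\,p(B/A_n)
\end{aligned}$$

If the outcome is B, the probability that the event $A_r$ has occurred is

$$p(A_r/B) = \frac{x_r}{\sum_{i=1}^{n} x_i} = \frac{p(A_r)\,p(B/A_r)}{\sum_{i=1}^{n} p(A_i)\,p(B/A_i)} \qquad (5.59)$$
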


This is Bayes’ Formula. 
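
As a check on the formula, the two-box problem at the start of this section can be worked mechanically. The sketch below is illustrative only; the function name `bayes_posterior` and the list layout are assumptions made here, not notation from the text.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return p(A_r/B) for every cause A_r, given the prior probabilities
    p(A_i) and the conditional probabilities p(B/A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # the areas x_i
    total = sum(joint)                                    # p(B)
    return [x / total for x in joint]

# Two boxes: p(A) = 3/4, p(A-bar) = 1/4; black ball: p(B/A) = 2/5, p(B/A-bar) = 2/3
priors = [Fraction(3, 4), Fraction(1, 4)]
likelihoods = [Fraction(2, 5), Fraction(2, 3)]
posterior = bayes_posterior(priors, likelihoods)
print(posterior)
```

The first entry of `posterior` is the probability that a black ball came from the first box; the entries always sum to 1 because the causes are exhaustive.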
In some cases, the initial causal events are equally probable or it may be
reasonable to assume they are. In such cases, the probabilities $p(A_i)$ are
equal so that Bayes’ Formula simplifies to

$$p(A_r/B) = \frac{p(B/A_r)}{\sum_{i=1}^{n} p(B/A_i)} \qquad (5.60)$$




EXAMPLE 5.18 

(A) A tennis tournament takes place in June. The probability that a certain 
player will win on a fine day is 0.8 and on a wet day is 0.6. The probability 
of a wet day in June is 0.2. If the player wins a game during the tournament, 
what was the probability that it rained on that day? 


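Solution Let $A$, $\bar{A}$ be the alternative events of a fine and a wet day.

$$p(A) = 0.8 \quad \text{and} \quad p(\bar{A}) = 0.2$$

Let $B$, $\bar{B}$ be the alternative events of the player winning and losing.

$$p(B/A) = 0.8 \quad \text{and} \quad p(B/\bar{A}) = 0.6$$

We wish to calculate $p(\bar{A}/B)$. Using Bayes’ Formula

$$p(\bar{A}/B) = \frac{p(\bar{A})\,p(B/\bar{A})}{p(A)\,p(B/A) + p(\bar{A})\,p(B/\bar{A})} = \frac{(0.2)(0.6)}{(0.8)(0.8) + (0.2)(0.6)} = 3/19$$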


(B) Two towns A and B are connected by four different roads $R_1$, $R_2$, $R_3$
and $R_4$. In travelling from A to B by car, the roads are selected with
probabilities 0.2, 0.3, 0.4 and 0.1 respectively. The probabilities that the car can
travel from A to B in one hour along these roads are respectively 0.6, 0.5, 0.7
and 0.3. If the car does the journey in one hour, what was the probability
that the road $R_1$ was selected?


Solution Let Q be the event that the car travels from A to B in one hour.
Let $R_1$ be the event that the car travels along the road $R_1$, etc.

$$p(R_1) = 0.2, \quad p(R_2) = 0.3, \quad p(R_3) = 0.4, \quad p(R_4) = 0.1$$
$$p(Q/R_1) = 0.6, \quad p(Q/R_2) = 0.5, \quad p(Q/R_3) = 0.7, \quad p(Q/R_4) = 0.3$$

Required probability

$$p(R_1/Q) = \frac{p(R_1)\,p(Q/R_1)}{\sum_{i=1}^{4} p(R_i)\,p(Q/R_i)} = \frac{(0.2)(0.6)}{(0.2)(0.6) + (0.3)(0.5) + (0.4)(0.7) + (0.1)(0.3)} = 6/29$$




(C) The probability of winning on a certain type of gambling machine is 
0.15. One of four such machines is known to be out of order and the proba- 
bility of winning on this machine is 0.3. One of the machines is selected at 
random. If the player wins, show that the probability that he selected the 
abnormal machine is 2/5. If the player loses, show that the probability that he 
selected the abnormal machine is 14/65. 


Solution Since the machines are selected at random, it may be assumed that 
the probabilities of selecting each machine are equal. Consequently, the 
simplified form of Bayes’ Formula (5.60) may be used. 

Therefore the probability that, if the player wins, he has selected the 
abnormal machine is 


$$\frac{0.3}{0.3 + 3(0.15)} = 2/5$$


and the probability that, if the player loses, he has selected the abnormal 
machine is 


$$\frac{0.7}{0.7 + 3(0.85)} = 14/65$$
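
The two results for the gambling machines can be confirmed with exact fractions. This is a sketch under the stated assumption of equal selection probabilities; the helper name `posterior_equal_priors` is illustrative and simply evaluates the simplified formula (5.60).

```python
from fractions import Fraction

def posterior_equal_priors(likelihoods, r):
    """Simplified Bayes' Formula (5.60): the causes are equally probable,
    so only the conditional probabilities p(B/A_i) are needed."""
    return likelihoods[r] / sum(likelihoods)

# Machine 0 is the abnormal one (win 0.3); the other three win with 0.15.
win = [Fraction(3, 10)] + [Fraction(3, 20)] * 3
lose = [1 - w for w in win]

print(posterior_equal_priors(win, 0))   # abnormal machine, given a win
print(posterior_equal_priors(lose, 0))  # abnormal machine, given a loss
```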


EXERCISES 5.7 
1. Two boxes contain respectively 1 black and 2 red balls and 2 black and 3 red 
balls. One of the boxes is selected at random and a ball is drawn from it. What 
is the probability that the first box was selected if the ball drawn was (a) black, 
(b) red? [5/11, 10/19] 


2. In a group of women, 20% are blondes and 80% are brunettes. If 50% of the 
blondes and 40% of the brunettes are married, what is the probability that a 
married woman is a blonde? [5/21]


3. In a hospital, 95% of the patients who suffer from cancer and 3% of those who
do not, show a positive reaction to a cancer test. 2% of the patients in the 
hospital have cancer. A patient, randomly selected, reacts positively. What 
is the probability that this patient actually has cancer? [95/242 = 0.39 approx.]


4. A bets against B in a game of cards. The probability that A has a better hand 
than B is 0.2 and when A has the better hand, the probability that he will raise 
the bet is 0.6. If, however, A has the poorer hand, the probability that he will 
raise is 0.1. If A raises the bet, what is the probability that he has the winning 
hand? [3/5] 


5. There are 4 possible answers to each question on an examination paper. Good 
students know 80% of the answers and poor students 40%. If a good student 
has the correct answer to a question, what is the probability that he was guessing?
Answer the same question in the case of a poor student. [1/17, 3/11] 


6. Four boxes contain respectively 3 red, 2 red and 1 black, 1 red and 2 black and
3 black balls. The probabilities of selecting these boxes are respectively 0.4, 0.2,
0.3 and 0.1. A box is selected at random and 1 red ball is drawn. (a) What are
the probabilities that the first, second, third and fourth boxes were selected?
(b) What are the corresponding probabilities if the ball drawn was black?
[(a) 12/19, 4/19, 3/19, 0; (b) 0, 2/11, 6/11, 3/11]


7. A mouse chooses at random any one of 4 mazes to escape from a box. The
probabilities that the mouse will pass through these mazes in two minutes are
0.4, 0.2, 0.3 and 0.5. If the mouse escapes in two minutes, show that the
probability that he chose the first maze was 2/7 and that he chose the last maze was
5/14.


8. A carton, containing a large number of electrical components, is known to have 
come from one of four suppliers, Brown, Jones, Green and Smith. It is known 
that Brown never supplies a defective component and that the other three supply 
5%, 20% and 50% defectives respectively. The only method of checking a 
component is by destructive testing. 

A random sample of 3 components gives 1 defective and 2 non-defectives. 
What is the probability that the carton came from Jones? [1/15] 

Recalculate the probability given the prior information that the carton itself 
was selected at random from a warehouse when there were 500 cartons from 
Brown, 2 500 from Jones, 500 from Green and 500 from Smith [1/15, 5/19]. 


5.18 Tree Diagram 


Consider a finite sequence of experiments such that the outcomes of each 
experiment, assumed finite in number, have certain probabilities. Such a 
sequence is called a stochastic process since the various outcomes depend on 
chance.* 

Since there are several alternative outcomes for each experiment, a 
sequence of such experiments will result in various series of outcomes to 
each of which it is sought to give a probability measure so that predictions 
can be made for the stochastic process as a whole. 

The study of such processes can be simplified by the use of schematic 
diagrams which show the set of all possible outcomes of the sequence of 
experiments. 

Let us consider a sequence of three experiments, the stochastic process for 
which is illustrated by the tree diagram in Fig. 5.8. 

Let the first experiment have two possible outcomes a, b only for which the
probabilities are $p_a$, $p_b$ respectively. Let c, d, e be the three possible outcomes
for the second experiment assuming that a was the outcome of the first
experiment, and let f, g be the two possible outcomes of the second experiment
assuming that b was the outcome of the first experiment. Let h, j, k be the
three outcomes of the third experiment assuming that a, c were the outcomes
of the first two experiments, and so on.

* Greek word "stochos" means "guess".

Figure 5.8 Tree Diagram


Let $p_{ac}$ be the probability that the outcome c occurs in the second
experiment, the outcome a having occurred in the first experiment. $p_{ad}$, $p_{ae}$ have
similar meanings.

Let $p_{ach}$ denote the probability of the outcome h in the third experiment,
the outcomes a, c having occurred in the first and second experiments, and
so on.

Since the outcomes of the first experiment are either a or b, we have

$$p_a + p_b = 1$$

Similarly, for the second and third experiments,

$$p_{ac} + p_{ad} + p_{ae} = 1 \qquad p_{ach} + p_{acj} + p_{ack} = 1$$

The probability measure of the sequence of outcomes a, c, h is $p_a p_{ac} p_{ach}$.
Similarly, probability measures can be assigned to each sequence of outcomes
which is represented by the various branches of the tree diagram.

The sum of the probabilities in all those branches for which the outcomes
of the first two experiments were a and c is

$$p_a p_{ac} p_{ach} + p_a p_{ac} p_{acj} + p_a p_{ac} p_{ack} = p_a p_{ac}(p_{ach} + p_{acj} + p_{ack}) = p_a p_{ac}$$




This is the probability of obtaining the outcome c in the second experiment 
having obtained the outcome a in the first. Similar results would be obtained 
for any other pair of outcomes in the first two experiments. 

Again the sum of the probabilities in all those branches for which the 
outcome of the first experiment was b is 


$$p_b p_{bf} + p_b p_{bg} = p_b(p_{bf} + p_{bg}) = p_b$$


EXAMPLE 5.19 
The percentages of the electorate in two towns A and B who vote Conserva- 
tive, Labour and Liberal are as follows: 


Town    Conservative    Labour    Liberal
A       50              40        10
B       30              50        20


One of the towns is chosen at random and two voters are chosen randomly 
and successively from that town. Construct a tree diagram and estimate the 
probability that (a) both voters are Labour, (b) the second voter chosen was
Liberal. 


Solution (a) The required probability is the probability of selecting town
A and then selecting 2 Labour voters from it plus the probability of selecting
town B and then selecting 2 Labour voters from this. From Fig. 5.9, it can
be seen that

$$\text{Required probability} = \tfrac{1}{2}(0.4)^2 + \tfrac{1}{2}(0.5)^2 = 0.205$$

Figure 5.9


(b) The required probability is the probability of selecting two Liberals plus
the probability of selecting 1 Conservative then 1 Liberal plus the probability
of selecting 1 Labour then 1 Liberal, i.e.

$$\left\{\tfrac{1}{2}(0.1)^2 + \tfrac{1}{2}(0.2)^2\right\} + \left\{\tfrac{1}{2}(0.5)(0.1) + \tfrac{1}{2}(0.3)(0.2)\right\} + \left\{\tfrac{1}{2}(0.4)(0.1) + \tfrac{1}{2}(0.5)(0.2)\right\} = 0.15$$


EXAMPLE 5.20 
(A) A box contains a sample of 12 electrical components produced by a 
machine; 4 of the components are known to be defective; 3 components are 
drawn successively at random from the box, without replacement. Draw the 
tree diagram and find the probability that 

(a) 2 or more defectives are drawn 

(b) precisely 1 defective is drawn 

(c) if the first component drawn is defective, all three are defective. 


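(a) The branches of the tree which lead to 2 or more defectives are GDD,
DGD, DDG, DDD. The required probability is the sum of the probabilities
in these branches, i.e.

$$\tfrac{8}{12} \times \tfrac{4}{11} \times \tfrac{3}{10} + \tfrac{4}{12} \times \tfrac{8}{11} \times \tfrac{3}{10} + \tfrac{4}{12} \times \tfrac{3}{11} \times \tfrac{8}{10} + \tfrac{4}{12} \times \tfrac{3}{11} \times \tfrac{2}{10} = 13/55$$

Figure 5.10 D = defective G = good

(b) Those branches which lead to precisely 1 defective are GGD, GDG, DGG.
The required probability is

$$\tfrac{8}{12} \times \tfrac{7}{11} \times \tfrac{4}{10} + \tfrac{8}{12} \times \tfrac{4}{11} \times \tfrac{7}{10} + \tfrac{4}{12} \times \tfrac{8}{11} \times \tfrac{7}{10} = 28/55$$

(c) If the first defective has already been drawn, the probability of drawing
2 more defectives is

$$\tfrac{3}{11} \times \tfrac{2}{10} = 3/55$$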
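
The tree for this example is small enough to enumerate by machine. The following sketch walks every branch of a three-draw tree without replacement and adds up the branch probabilities; the function name `branch_probability` and the string labels are illustrative choices, not part of the text.

```python
from fractions import Fraction
from itertools import product

def branch_probability(branch, good=8, defective=4):
    """Probability of one branch such as ('G', 'D', 'D') when components
    are drawn without replacement from 8 good and 4 defective."""
    prob = Fraction(1)
    g, d = good, defective
    for outcome in branch:
        remaining = g + d
        if outcome == "D":
            prob *= Fraction(d, remaining)
            d -= 1
        else:
            prob *= Fraction(g, remaining)
            g -= 1
    return prob

branches = {"".join(b): branch_probability(b) for b in product("GD", repeat=3)}

two_or_more = sum(p for b, p in branches.items() if b.count("D") >= 2)
exactly_one = sum(p for b, p in branches.items() if b.count("D") == 1)
first_defective = sum(p for b, p in branches.items() if b[0] == "D")
all_three_given_first = branches["DDD"] / first_defective

print(two_or_more, exactly_one, all_three_given_first)
```

Part (c) is obtained here as a conditional probability, dividing the DDD branch by the total probability of all branches that start with D, which agrees with the direct product used in the solution.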


(B) Two urns X and Y contain respectively 3 black and 2 white balls and 2
black and 4 white balls. A ball is drawn at random from X and placed in Y
and a ball is then drawn at random from Y. Find the probability that (a) both
the balls drawn were of the same colour, (b) if the second ball drawn was
white, the first ball was black.

Figure 5.11 B = black W = white

(a) Required probability is the sum of the probabilities for BB and WW, i.e.

$$\tfrac{3}{5} \times \tfrac{3}{7} + \tfrac{2}{5} \times \tfrac{5}{7} = 19/35$$

(b) The probability that the event B is followed by the event W is

$$p(B \cap W) = \tfrac{3}{5} \times \tfrac{4}{7} = 12/35$$

Required probability is

$$p(B/W) = \frac{p(B \cap W)}{p(W)} = \frac{12/35}{12/35 + 10/35} = 6/11$$




5.19 Independent Trials 


By adopting certain assumptions concerning the outcomes of chance experi- 
ments, the generalized treatment of stochastic processes, outlined in the last 
section, can be simplified to provide a mathematical analysis of the proba- 
bilities associated with experiments which occur in practice. 

In this book, we shall be concerned with only two types of practical 
stochastic processes. In the next section, we shall consider Markov chain 
processes which have applications to biological and social sciences. In this 
section, independent trials processes will be considered. 

Suppose that one particular experiment is repeatedly performed in such
a way that the outcome of any one experiment has no effect on the outcome
of any other. Let there be r outcomes $a_1, a_2, \ldots, a_r$ for each of these
experiments occurring respectively with the probabilities $p_1, p_2, \ldots, p_r$ which are
constant from experiment to experiment. Such a sequence of experiments
comprises a process of independent trials.

Consider first a series of independent trials in which each trial has two
outcomes only, which will be arbitrarily called "success" and "failure". The
outcomes occur with constant probabilities p and q, where p is the probability of
success and q that of failure and (p + q) = 1. The probability distribution of x
successes and (n − x) failures in n trials is the Binomial Distribution discussed in
Section 5.12. A tree diagram for n = 3 is shown in Fig. 5.12.

It is required to calculate the probability of x successes and (n − x)
failures in n independent trials. Suppose that the tree diagram in Fig. 5.12
were extended to illustrate the case of n independent experiments. The
required probability is the sum of the probabilities in all those branches which
pass through x branch points marked S and (n − x) marked F. The
probabilities in each of these branches would clearly be $p^x q^{n-x}$ since every segment
going towards an S has a probability p and every segment going towards an
F has a probability q. It remains to find the total number of branches bearing
the probability $p^x q^{n-x}$. To each such branch, there corresponds an ordered
partition of the integers 1, 2, ..., n into two cells, x integers in the first and
(n − x) in the second. The integers in the first cell give the numbers of those
trials for which S occurred and those in the second cell for which F occurred.
The number of such partitions is

$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$

so that the probability of x successes in n trials is

$$\binom{n}{x} p^x q^{n-x} \quad \text{(see eqn. 5.36)} \qquad (5.61)$$

Figure 5.12 S = success F = failure
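
The counting argument can be illustrated by listing the branches of the tree explicitly and comparing the total with eqn (5.61). This is a minimal sketch; the function names and the value p = 1/3 are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def binomial_formula(x, n, p):
    """Eqn (5.61): C(n, x) p^x q^(n-x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def binomial_by_branches(x, n, p):
    """Add the probabilities of all tree branches holding exactly x successes."""
    q = 1 - p
    total = Fraction(0)
    for branch in product("SF", repeat=n):
        if branch.count("S") == x:
            prob = Fraction(1)
            for outcome in branch:
                prob *= p if outcome == "S" else q
            total += prob
    return total

p = Fraction(1, 3)  # assumed probability of success, for illustration only
for x in range(4):
    print(x, binomial_formula(x, 3, p), binomial_by_branches(x, 3, p))
```

For n = 3 the tree has 8 branches, and the two columns printed for each x agree, as the partition argument predicts.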

Let us now consider the general case for independent trials with more than
two outcomes. For n trials, we wish to calculate the probability that there are
$n_1$ occurrences of the outcome $a_1$, $n_2$ of $a_2$, ..., $n_r$ of $a_r$ where
$n = n_1 + n_2 + \cdots + n_r$.

Figure 5.13

Figure 5.13 illustrates part of the tree corresponding to the particular case
when n = 6 when there are three outcomes $a_1$, $a_2$, $a_3$ for each trial. Suppose
that we wish to calculate the probability that there are 2 occurrences of $a_1$,
1 of $a_2$ and 3 of $a_3$ in 6 trials. The heavy line shows a branch of the tree
diagram corresponding to such a group of occurrences. The branch
probability is $p_1^2 p_2 p_3^3$.




There are other branches of the tree corresponding to 2 occurrences of $a_1$,
1 of $a_2$, and 3 of $a_3$, and each branch has the probability $p_1^2 p_2 p_3^3$. It remains
to find the number of such branches. This number is the number of partitions
of the integers 1, 2, ..., 6 into three cells containing respectively 1, 2 and 3
of these integers. Thus the number of branches is 6!/(1! 2! 3!) so that

$$\text{Required probability} = \frac{6!}{1!\,2!\,3!}\, p_1^2\, p_2\, p_3^3$$


For n independent trials, it readily follows that the probability of $x_1$
occurrences of $a_1$, $x_2$ of $a_2$, ..., $x_r$ of $a_r$ (where $n = x_1 + x_2 + \cdots + x_r$) is

$$f(x_1, x_2, \ldots, x_r) = \frac{n!}{x_1!\,x_2! \cdots x_r!}\, p_1^{x_1} p_2^{x_2} \cdots p_r^{x_r} \qquad (5.62)$$


This probability function is known as the Multinomial Distribution
since (5.62) is the general term of the multinomial function

$$(p_1 + p_2 + \cdots + p_r)^n$$

just as the binomial probability function is the general term of the binomial
function $(p + q)^n$.
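
A direct evaluation of the multinomial probability function may be sketched as follows. The function name `multinomial_pmf` and the equal probabilities of 1/3 used in the example are assumptions made for illustration; only the formula itself comes from eqn (5.62).

```python
from fractions import Fraction
from math import factorial

def multinomial_pmf(counts, probs):
    """Eqn (5.62): n!/(x_1! x_2! ... x_r!) * p_1^x_1 * ... * p_r^x_r."""
    n = sum(counts)
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)  # exact at every step: partial quotients are integers
    prob = Fraction(coeff)
    for x, p in zip(counts, probs):
        prob *= Fraction(p) ** x
    return prob

# 2 occurrences of a_1, 1 of a_2 and 3 of a_3 in 6 trials,
# with assumed equal probabilities of 1/3 for each outcome.
third = Fraction(1, 3)
print(multinomial_pmf((2, 1, 3), (third, third, third)))
```

With two outcomes the same function reduces to the binomial probability of eqn (5.61).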

As a special case of eqn (5.62), we may, for example, only be interested
in finding the probability of $x_1$ occurrences of $a_1$, $x_2$ of $a_2$ irrespective of the
numbers of individual occurrences of the other possible outcomes. In this
case, the process may be considered to be a sequence of n independent trials
having outcomes $a_1$, $a_2$, $a_3'$, where $a_3'$ represents any one of the outcomes
$a_3, a_4, \ldots, a_r$, these three outcomes occurring respectively with probabilities
$p_1$, $p_2$, $p_3'$ where $p_3' = 1 - (p_1 + p_2)$.

$$\text{Number of occurrences of } a_3' = x_3' = n - (x_1 + x_2)$$

$$\text{Required probability} = \frac{n!}{x_1!\,x_2!\,x_3'!}\, p_1^{x_1} p_2^{x_2} (p_3')^{x_3'} \qquad (5.63)$$


EXAMPLE 5.21 

(A) A fair die is thrown 10 times. Find the probabilities that 
(a) each face except the six comes up twice, 

(b) the six comes up exactly 3 times and the one exactly twice. 


Solution (a) The probability that each face comes up in any trial is 1/6.
By eqn (5.62), the required probability is

$$f(2, 2, 2, 2, 2, 0) = \frac{10!}{(2!)^5}\left(\frac{1}{6}\right)^{10} = 0.0019$$

(b) The probability of a throw different from 1 and 6 is 4/6 and there are
5 such occurrences. By (5.63), the required probability is

$$\frac{10!}{3!\,2!\,5!}\left(\frac{1}{6}\right)^3\left(\frac{1}{6}\right)^2\left(\frac{4}{6}\right)^5 = \frac{10!}{3!\,2!\,5!} \cdot \frac{4^5}{6^{10}} \approx 0.043$$

(B) The probabilities that a football team will win, lose or draw, are
respectively 0.7, 0.2 and 0.1. What are the probabilities that in four matches, the
team will win exactly three matches and less than three matches?


Solution If the team wins exactly three matches, the result of the remaining
match is either a loss or a draw. Therefore the required probability is

$$f(3, 1, 0) + f(3, 0, 1) = \frac{4!}{3!\,1!}\left\{(0.7)^3(0.2) + (0.7)^3(0.1)\right\} = 4(0.7)^3(0.3) = 0.412$$

For the team to win less than three matches, the required probability is

$$f(0, 4, 0) + f(0, 3, 1) + f(0, 2, 2) + f(0, 1, 3) + f(0, 0, 4) + f(1, 3, 0) + f(1, 2, 1) + f(1, 1, 2) + f(1, 0, 3) + f(2, 2, 0) + f(2, 1, 1) + f(2, 0, 2)$$

It is, of course, simpler to calculate this probability by first calculating the
total probability that the team wins exactly three and exactly four matches
and subtracting from 1.

The probability of winning exactly four matches is

$$f(4, 0, 0) = (0.7)^4 = 0.2401$$

Therefore the probability of winning less than three matches is

$$1 - (0.412 + 0.240) = 0.348$$


EXERCISES 5.8 
1. In a tennis tournament a man plays 3 matches alternately against two other 
players A and B. The probabilities that he will win against A and B are respec- 
tively 1/4 and 2/3. He plays A first. Find the probabilities that he will win (a) all 
3 matches, (b) precisely 2 matches, (c) 2 consecutive matches. 
What are the corresponding probabilities if he plays B first?
[1/24, 13/48, 7/24; 1/9, 4/9, 2/9] 




2. The probabilities of winning on two gambling machines A and B are respectively 
1/4 and 1/5. A player does not know which machine is A and selects a machine 
at random. Draw a tree diagram for the first game and find the probabilities that 
the player (a) wins, (b) chose machine B if he eventually lost. [9/40, 16/31]


3. In exercise 2, if the player plays twice, find the probability that he will win at 
least once if (a) he plays the same machine twice, (b) he changes machines if he
loses the first time. [319/800, 2/5] 


4. In a certain town, 50% of the electorate vote Labour, 35% vote Conservative 
and 15% vote Liberal. 6 of the voters are chosen at random. What is the proba-
bility that equal numbers support each party? [0.062] 


5. Three dice are thrown simultaneously. What is the probability of a total throw 
of 6? [5/108] 


6. A bag contains 3 red, 2 white and 5 black balls. 6 balls are drawn successively 
with replacement. Find the probability of drawing 2 red, 1 white and 3 black 
balls. [0.135] 


7. A die is rolled 3 times. What are the probabilities of (a) a particular double, (b)
any double? What is the probability of any three singles? [1/72, 5/12, 5/9] 


8. In a game of chance, a ball is tossed into boxes which are numbered 1, 2, 3 and 
4. The probabilities that the ball will fall into these boxes are 0.5, 0.25, 0.15 and 
0.10 respectively. For each ball thrown into a box, the player receives as many 
pounds as the number on the box. Show that the probability that a player wins 
£5 or more in two throws is 0.2875. 


9. The probability that an anti-aircraft missile will destroy an aircraft is 3/5. The 
probabilities of a miss and a near miss are each 1/5. Two near misses will destroy 
an aircraft. What is the probability that four missiles will destroy an aircraft? 
[124/125] 

[Hint: First find the probability that the aircraft will not be destroyed.] 


5.20 Markov Chain Processes 


To end this chapter, we shall deal briefly with Markov decision processes 
which comprise an important class of stochastic optimization problems. 
Consider a sequence of stochastic experiments such that the outcome of each
experiment is one of a finite number of possible outcomes $a_1, a_2, \ldots, a_r$
called states. The Markov process concerns a sequence such that the state
of the system at any stage depends only upon its state at the immediately
preceding stage. Let $p_{ij}$ be the probability that the system is in the state $a_j$
after an experiment given that it was in the state $a_i$ after the preceding
experiment. The symbols $p_{ij}$ for the various transitions are called transition
probabilities and may be conveniently exhibited in the square transition
matrix

$$P = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1r} \\
p_{21} & p_{22} & \cdots & p_{2r} \\
\vdots & \vdots & & \vdots \\
p_{m1} & p_{m2} & \cdots & p_{mr} \\
\vdots & \vdots & & \vdots \\
p_{r1} & p_{r2} & \cdots & p_{rr}
\end{pmatrix} \qquad (5.64)$$

The sum of the transition probabilities in any row, say the mth row,
of the matrix P is equal to 1, i.e.

$$\sum_{k=1}^{r} p_{mk} = 1$$

since, if the system is in the state $a_m$ after any experiment, it must be in one
of the states $a_1, a_2, \ldots, a_r$ after the next experiment.


DEFINITION 5.9 
A stochastic matrix is a square matrix with non-negative elements such that 
the sum of the elements in each row is 1. 


Thus P is a stochastic matrix. Assuming that the initial state of the system is 
known, P provides enough information to construct a tree diagram for the 
process from which the probabilities of various outcomes following a sequence 
of experiments can be calculated. As will be seen, however, Markov processes 
can usually be treated more conveniently by the methods of matrix algebra. 

For simplicity, we shall consider three-state Markov processes only but the 
procedure used can be readily applied to cases of more than three states. 
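For three states $a_1$, $a_2$ and $a_3$, the appropriate stochastic transition matrix is

$$P = \begin{array}{c|ccc}
 & a_1 & a_2 & a_3 \\ \hline
a_1 & p_{11} & p_{12} & p_{13} \\
a_2 & p_{21} & p_{22} & p_{23} \\
a_3 & p_{31} & p_{32} & p_{33}
\end{array} \qquad (5.65)$$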


The Markov process is concerned essentially with problems of the following
type. Given that the system is in the state $a_i$ initially, what is the probability
that it will be in the state $a_j$ after n experiments? This probability will be
denoted by $p_{ij}^{(n)}$ and these probabilities will be required for all possible
initial and final states. For the three-state Markov chain for n experiments,
these probabilities may be exhibited in the matrix form:

$$P^{(n)} = \begin{array}{c|ccc}
 & a_1 & a_2 & a_3 \\ \hline
a_1 & p_{11}^{(n)} & p_{12}^{(n)} & p_{13}^{(n)} \\
a_2 & p_{21}^{(n)} & p_{22}^{(n)} & p_{23}^{(n)} \\
a_3 & p_{31}^{(n)} & p_{32}^{(n)} & p_{33}^{(n)}
\end{array} \qquad (5.66)$$

so that if, for example, the process begins in the state $a_1$, the probabilities
that it will be in the states $a_1$, $a_2$, $a_3$ after n experiments are given by the
elements of the first row of $P^{(n)}$.

Suppose that a three-state Markov process starts by means of a chance
device which places the system in the states $a_1$, $a_2$ and $a_3$ with probabilities
$p_1^{(0)}$, $p_2^{(0)}$ and $p_3^{(0)}$ respectively.

These initial probabilities may be represented by the vector

$$p^{(0)} = (p_1^{(0)}, p_2^{(0)}, p_3^{(0)})$$

which is called a probability vector. Let $p_j^{(n)}$ (j = 1, 2, 3) be the probability
that the system will be in the state $a_j$ after n steps. Similarly, the final
probabilities may be represented by the probability vector

$$p^{(n)} = (p_1^{(n)}, p_2^{(n)}, p_3^{(n)})$$


Since the probability of being in the state $a_1$ after n experiments equals the
sum of the probabilities of being in the states $a_1$, $a_2$ and $a_3$ respectively after
the (n − 1)th experiment and then moving from each to the state $a_1$ after the
nth experiment, it follows that

$$p_1^{(n)} = p_1^{(n-1)} p_{11} + p_2^{(n-1)} p_{21} + p_3^{(n-1)} p_{31}$$

Similarly

$$p_2^{(n)} = p_1^{(n-1)} p_{12} + p_2^{(n-1)} p_{22} + p_3^{(n-1)} p_{32}$$

$$p_3^{(n)} = p_1^{(n-1)} p_{13} + p_2^{(n-1)} p_{23} + p_3^{(n-1)} p_{33}$$

It follows that

$$p^{(n)} = p^{(n-1)}P = p^{(n-2)}P^2 = \cdots = p^{(0)}P^n \qquad (5.67)$$




Thus the probability vector after n experiments is obtained by multiplying the
initial probability vector by the nth power of the transition matrix P.
Now suppose that the process begins in the state $a_1$. In this case

\[ p^{(0)} = (1, 0, 0) \]

and, by eqn (5.67), the probabilities that the process will be in one of the
three states $a_1$, $a_2$, $a_3$ after n experiments will be the elements of the first row
of the matrix $P^n$. Similarly, if the process begins in one of the states $a_2$ or $a_3$,
the probabilities after n experiments will be the elements of the second and
third rows of the matrix $P^n$ respectively. With reference to eqn (5.66), it is
clear that the matrix $P^{(n)}$ is the nth power of P, i.e. $P^{(n)} = P^n$.


EXAMPLE 5.22 
(A) The transition matrix for a three-state Markov chain is

\[
P = \begin{pmatrix}
0 & 0 & 1 \\
2/3 & 0 & 1/3 \\
0 & 1/4 & 3/4
\end{pmatrix}
\]

with rows and columns labelled $a_1$, $a_2$, $a_3$. Use tree diagrams for three
experiments to determine the matrix $P^{(3)}$. Verify by matrix multiplication
that $P^{(3)} = P^3$.

Solution Draw the tree diagram for three experiments (Fig. 5.14), first
assuming that the process starts in the state $a_1$.

The probability that the process begins in the state $a_1$ and ends in the state
$a_1$ is $p_{11}^{(3)} = 1 \times \tfrac{1}{4} \times \tfrac{2}{3} = 1/6$.


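Figure 5.14

Similarly

\[ p_{12}^{(3)} = 1 \times \tfrac{3}{4} \times \tfrac{1}{4} = 3/16 \]
\[ p_{13}^{(3)} = 1 \times \tfrac{1}{4} \times \tfrac{1}{3} + 1 \times \tfrac{3}{4} \times \tfrac{3}{4} = 31/48 \]

Note that, as expected, $p_{11}^{(3)} + p_{12}^{(3)} + p_{13}^{(3)} = 1$.

These three probabilities constitute the first row of the matrix $P^{(3)}$. By
drawing two other tree diagrams beginning from the states $a_2$ and $a_3$, the
elements of the second and third rows of the matrix $P^{(3)}$ may be calculated.
It will be found that

\[
P^{(3)} = \begin{pmatrix}
1/6 & 3/16 & 31/48 \\
1/18 & 11/48 & 103/144 \\
1/8 & 31/192 & 137/192
\end{pmatrix}
\]

Now

\[
P^2 = \begin{pmatrix} 0 & 0 & 1 \\ 2/3 & 0 & 1/3 \\ 0 & 1/4 & 3/4 \end{pmatrix}
\begin{pmatrix} 0 & 0 & 1 \\ 2/3 & 0 & 1/3 \\ 0 & 1/4 & 3/4 \end{pmatrix}
= \begin{pmatrix} 0 & 1/4 & 3/4 \\ 0 & 1/12 & 11/12 \\ 1/6 & 3/16 & 31/48 \end{pmatrix}
\]

so that

\[
P^3 = \begin{pmatrix} 0 & 0 & 1 \\ 2/3 & 0 & 1/3 \\ 0 & 1/4 & 3/4 \end{pmatrix}
\begin{pmatrix} 0 & 1/4 & 3/4 \\ 0 & 1/12 & 11/12 \\ 1/6 & 3/16 & 31/48 \end{pmatrix}
= \begin{pmatrix} 1/6 & 3/16 & 31/48 \\ 1/18 & 11/48 & 103/144 \\ 1/8 & 31/192 & 137/192 \end{pmatrix}
\]

thus verifying that $P^{(3)} = P^3$.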
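The agreement between the tree-diagram method and matrix multiplication in part (A) may also be checked mechanically. The following is a minimal sketch in Python, assuming exact arithmetic with the standard `fractions` module; the helper names `tree_probability` and `mat_mul` are illustrative and do not come from the text.

```python
from fractions import Fraction as F
from itertools import product

# Transition matrix of Example 5.22(A); states a1, a2, a3 are indexed 0, 1, 2.
P = [[F(0), F(0), F(1)],
     [F(2, 3), F(0), F(1, 3)],
     [F(0), F(1, 4), F(3, 4)]]

def tree_probability(P, start, end, steps):
    """Sum the branch products over every path start -> ... -> end."""
    total = F(0)
    for middle in product(range(len(P)), repeat=steps - 1):
        path = (start, *middle, end)
        branch = F(1)
        for i, j in zip(path, path[1:]):
            branch *= P[i][j]
        total += branch
    return total

def mat_mul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Three-step matrix by the tree method and by matrix multiplication.
tree = [[tree_probability(P, i, j, 3) for j in range(3)] for i in range(3)]
cube = mat_mul(mat_mul(P, P), P)
print(tree == cube)
print(tree[0])
```

Each entry of `tree` is the sum over all branches of the tree diagram, which is exactly what the repeated matrix product accumulates.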


(B) The transition matrix for a three-state Markov process is

\[
P = \begin{pmatrix}
1 & 0 & 0 \\
0 & 2/3 & 1/3 \\
1/4 & 1/2 & 1/4
\end{pmatrix}
\]

The initial probability vector is $p^{(0)} = (1/2, 1/4, 1/4)$. Find the
probabilities for the three states after two experiments.


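Solution

\[
P^2 = \begin{pmatrix}
1 & 0 & 0 \\
1/12 & 11/18 & 11/36 \\
5/16 & 11/24 & 11/48
\end{pmatrix}
\]

The probability vector after two experiments is

\[
p^{(2)} = p^{(0)} P^2 = (1/2 \;\; 1/4 \;\; 1/4)
\begin{pmatrix}
1 & 0 & 0 \\
1/12 & 11/18 & 11/36 \\
5/16 & 11/24 & 11/48
\end{pmatrix}
= (115/192 \;\; 77/288 \;\; 77/576)
\]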
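The result of part (B) can be reproduced by applying eqn (5.67) one experiment at a time. This is a sketch only, again assuming Python's `fractions` module; the function name `step` is hypothetical.

```python
from fractions import Fraction as F

# Transition matrix and initial vector of Example 5.22(B).
P = [[F(1), F(0), F(0)],
     [F(0), F(2, 3), F(1, 3)],
     [F(1, 4), F(1, 2), F(1, 4)]]

def step(p, P):
    """One experiment: the row vector p becomes pP."""
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(P))]

p = [F(1, 2), F(1, 4), F(1, 4)]
for _ in range(2):          # two experiments
    p = step(p, P)
print(p)
```

Multiplying the vector by P twice avoids forming the matrix $P^2$ explicitly, which is convenient when only one initial vector is of interest.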


5.21 Linear Transformation of Vectors 


From eqn (5.67) it is seen that the vector $p^{(n-1)}$ is transformed into the
vector $p^{(n)}$ by multiplying the former by the matrix P. It follows that successive
multiplications by P transform $p^{(0)}$ successively into $p^{(1)}$, $p^{(2)}$, $p^{(3)}$, .... This
is an example of the linear transformation of vectors already mentioned in
Chapter 9 of volume 1.


DEFINITION 5.10 
A stochastic matrix is said to be regular if some power of the matrix has all 
its elements positive. 


For a regular stochastic matrix a probability vector r can be found which
transforms into itself when multiplied by P, i.e.

\[ rP = r \]

If the elements of r are interpreted as the coordinates of a point in Euclidean
space, then r becomes a fixed point of the transformation P.

The fixed point r may easily be found for a regular 2 × 2 stochastic matrix
of the form

\[ P = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix} \]

for suppose $r = (x_1, y_1)$ is the fixed point, then

\[ (x_1 \;\; y_1) \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix} = (x_1 \;\; y_1) \]

so that

\[ (1-a)x_1 + b y_1 = x_1 \quad\text{and}\quad a x_1 + (1-b) y_1 = y_1 \]

Each of these equations reduces to $a x_1 = b y_1$ and since $x_1 + y_1 = 1$, we have

\[ x_1 = \frac{b}{a+b} \quad\text{and}\quad y_1 = \frac{a}{a+b} \]

Thus $(b/(a+b),\ a/(a+b))$ is the unique fixed point of the transformation.
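The formula for the fixed point of a 2 × 2 stochastic matrix is easily tested numerically. The sketch below assumes a + b > 0; the values chosen for a and b are illustrative only and are not taken from the text.

```python
from fractions import Fraction as F

def fixed_point_2x2(a, b):
    """Fixed point of [[1-a, a], [b, 1-b]], assuming a + b > 0."""
    return (b / (a + b), a / (a + b))

a, b = F(1, 3), F(1, 6)      # illustrative values only
x, y = fixed_point_2x2(a, b)

# Check rP = r component by component, and that r is a probability vector.
left = ((1 - a) * x + b * y, a * x + (1 - b) * y)
print(left == (x, y), x + y == 1)
```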
If, in the general case, the initial probability vector $p^{(0)}$ is identified with
r then

\[ p^{(n)} = p^{(0)} P^n = r P^n = r P^{n-1} = \cdots = r = p^{(0)} \]

so that we have a stationary Markov process in which the probabilities of the
various states at any stage of the process are the same as those at the initial
stage, namely r.

The following theorems are stated without proof.
If P is a regular stochastic matrix then, as n → ∞,
(I) $P^n \to R$, a matrix of which each of the rows comprises all the elements
of a unique probability vector r.
(II) $p^{(0)} P^n \to r$ where $p^{(0)}$ is any probability vector.
The second theorem implies that no matter what the initial probabilities
$p^{(0)}$ of the system may be, after a large number of experiments, the
probabilities of the various states will be given by the vector r since

\[ p^{(n)} = p^{(0)} P^n \to r \]

Most problems on Markov chains may be solved by applying this important
result.


EXAMPLE 5.23 
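Show that the matrix

\[
P = \begin{pmatrix}
1/2 & 1/4 & 1/4 \\
2/5 & 3/5 & 0 \\
1/2 & 0 & 1/2
\end{pmatrix}
\]

is regular and show that its fixed point is r = (8/17, 5/17, 4/17). Calculate
$P^2$, $P^3$, ... and verify that $P^n \to R$.

Solution

\[
P^2 = \begin{pmatrix} 1/2 & 1/4 & 1/4 \\ 2/5 & 3/5 & 0 \\ 1/2 & 0 & 1/2 \end{pmatrix}
\begin{pmatrix} 1/2 & 1/4 & 1/4 \\ 2/5 & 3/5 & 0 \\ 1/2 & 0 & 1/2 \end{pmatrix}
= \begin{pmatrix} 19/40 & 11/40 & 1/4 \\ 11/25 & 23/50 & 1/10 \\ 1/2 & 1/8 & 3/8 \end{pmatrix}
\]

All the elements of $P^2$ are positive so that P is a regular stochastic matrix.
Let the fixed point be $r = (x_1, y_1, z_1)$ so that

\[
(x_1 \;\; y_1 \;\; z_1)
\begin{pmatrix} 1/2 & 1/4 & 1/4 \\ 2/5 & 3/5 & 0 \\ 1/2 & 0 & 1/2 \end{pmatrix}
= (x_1 \;\; y_1 \;\; z_1)
\]

Therefore

\[ \tfrac{1}{2}x_1 + \tfrac{2}{5}y_1 + \tfrac{1}{2}z_1 = x_1 \]
\[ \tfrac{1}{4}x_1 + \tfrac{3}{5}y_1 = y_1 \]
\[ \tfrac{1}{4}x_1 + \tfrac{1}{2}z_1 = z_1 \]

and $x_1 + y_1 + z_1 = 1$.

The solutions of these equations are $x_1 = 8/17$, $y_1 = 5/17$, $z_1 = 4/17$,
so that

r = (8/17, 5/17, 4/17) or (0.47, 0.29, 0.24) approx.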


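Now

\[
P = \begin{pmatrix} 0.5 & 0.25 & 0.25 \\ 0.4 & 0.6 & 0 \\ 0.5 & 0 & 0.5 \end{pmatrix}
\quad\text{and}\quad
P^2 = \begin{pmatrix} 0.48 & 0.28 & 0.25 \\ 0.44 & 0.46 & 0.10 \\ 0.50 & 0.13 & 0.37 \end{pmatrix}
\]

It may be verified that

\[
P^3 = \begin{pmatrix} 189/400 & 227/800 & 39/160 \\ 227/500 & 193/500 & 4/25 \\ 39/80 & 1/5 & 5/16 \end{pmatrix}
= \begin{pmatrix} 0.47 & 0.28 & 0.24 \\ 0.45 & 0.39 & 0.16 \\ 0.49 & 0.20 & 0.31 \end{pmatrix}
\]

so that there is evidence that $P^n \to R$.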
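The tendency of $P^n$ towards R can be followed beyond the third power by machine. The sketch below measures, for n = 1 to 12, the largest difference between any entry of $P^n$ and the corresponding element of r; the helper names `mat_mul` and `max_gap` and the choice of twelve powers are assumptions made for illustration.

```python
from fractions import Fraction as F

# Matrix and fixed point of Example 5.23.
P = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(2, 5), F(3, 5), F(0)],
     [F(1, 2), F(0), F(1, 2)]]
r = [F(8, 17), F(5, 17), F(4, 17)]

def mat_mul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_gap(M, r):
    """Largest distance between any entry of M and the matching entry of r."""
    return max(abs(M[i][j] - r[j]) for i in range(len(M)) for j in range(len(r)))

power, gaps = P, []
for n in range(1, 13):
    gaps.append(float(max_gap(power, r)))   # gap for the nth power of P
    power = mat_mul(power, P)
print([round(g, 4) for g in gaps])
```

The gaps never increase from one power to the next, because each row of $P^{n+1}$ is a weighted average of the rows of $P^n$.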


EXAMPLE 5.24

Taking $p^{(0)} = (1/3, 1/3, 1/3)$, calculate $p^{(1)}$ and $p^{(2)}$ for the transition matrix
P of the previous example and verify that they are tending towards the fixed
point of P.

Solution

\[
p^{(1)} = p^{(0)} P = (1/3 \;\; 1/3 \;\; 1/3)
\begin{pmatrix} 1/2 & 1/4 & 1/4 \\ 2/5 & 3/5 & 0 \\ 1/2 & 0 & 1/2 \end{pmatrix}
= (7/15 \;\; 17/60 \;\; 1/4) = (0.47 \;\; 0.28 \;\; 0.25)
\]

This is a close approximation to r.

\[
p^{(2)} = p^{(0)} P^2 = (1/3 \;\; 1/3 \;\; 1/3)
\begin{pmatrix} 19/40 & 11/40 & 1/4 \\ 11/25 & 23/50 & 1/10 \\ 1/2 & 1/8 & 3/8 \end{pmatrix}
= (283/600 \;\; 43/150 \;\; 29/120) = (0.47 \;\; 0.29 \;\; 0.24)
\]

This is almost exactly the vector r.


EXAMPLE 5.25 
In a series of examinations, one of three particular questions always occurs. 
The same question never recurs in successive examinations. In the next 
examination, questions 2 and 3 are equally probable if question 1 came up 
in the last examination. If question 2 came up last time, question 1 is twice 
as probable as question 3, whilst if question 3 came up last time, question 2 
is three times as probable as question 1. 

Show that, in a long series of examinations, question 2 occurs most 
frequently and with a probability of 21/55. 


Solution The transition matrix here is

\[
P = \begin{pmatrix}
0 & 1/2 & 1/2 \\
2/3 & 0 & 1/3 \\
1/4 & 3/4 & 0
\end{pmatrix}
\]

where the rows and columns are labelled $Q_1$, $Q_2$, $Q_3$ and the three states of
the Markov chain are the choices of questions 1, 2 and 3 respectively. P is a
regular stochastic matrix since all the elements of $P^2$ are positive.

Whatever probability vector $p^{(0)}$ specifies the initial probabilities of the
three states $Q_1$, $Q_2$ and $Q_3$, after a long series of n examinations, the
probability vector $p^{(n)}$ specifying the final probabilities of the three states will
approximate closely to the fixed point of P.

Let the fixed point of P be $(x_1, y_1, z_1)$. Therefore

\[
(x_1 \;\; y_1 \;\; z_1)
\begin{pmatrix} 0 & 1/2 & 1/2 \\ 2/3 & 0 & 1/3 \\ 1/4 & 3/4 & 0 \end{pmatrix}
= (x_1 \;\; y_1 \;\; z_1)
\]

Therefore

\[ \tfrac{2}{3}y_1 + \tfrac{1}{4}z_1 = x_1 \]
\[ \tfrac{1}{2}x_1 + \tfrac{3}{4}z_1 = y_1 \]
\[ \tfrac{1}{2}x_1 + \tfrac{1}{3}y_1 = z_1 \]

and

\[ x_1 + y_1 + z_1 = 1 \]

so that

\[ x_1 = 18/55 \qquad y_1 = 21/55 \qquad z_1 = 16/55 \]


It follows that question 2 occurs most frequently. 
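The fixed point in this example is the solution of a small linear system, which may be solved exactly by elimination. The following sketch assumes that P is regular, so that the system has a unique solution; the function `fixed_point` is an illustrative helper and not a method given in the text.

```python
from fractions import Fraction as F

def fixed_point(P):
    """Solve rP = r with sum(r) = 1 by Gauss-Jordan elimination (exact).

    Assumes P is a regular stochastic matrix, so a pivot is always found.
    """
    n = len(P)
    # First n-1 rows: column equations of r(P - I) = 0; last row: normalisation.
    rows = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
            for j in range(n - 1)]
    rows.append([F(1)] * n + [F(1)])
    for col in range(n):
        pivot = next(k for k in range(col, n) if rows[k][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for k in range(n):
            if k != col:
                factor = rows[k][col]
                rows[k] = [u - factor * v for u, v in zip(rows[k], rows[col])]
    return [rows[i][n] for i in range(n)]

# Transition matrix of Example 5.25 (questions 1, 2, 3).
P = [[F(0), F(1, 2), F(1, 2)],
     [F(2, 3), F(0), F(1, 3)],
     [F(1, 4), F(3, 4), F(0)]]
r = fixed_point(P)
print(r)
```

One of the three equations rP = r is redundant, which is why it is replaced by the condition that the elements of r sum to 1.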


EXAMPLE 5.26 

The probability of winning with a certain type of gambling machine is 1/5. 
Of two such machines A and B, A is out of order and the probability of 
winning with it is 1/4. A player does not know which machine is A but selects 
one of the two machines at random. Show that if he wins his first game, the 
probability that he had selected A is 5/9 but if he loses his first game the 
probability that he had selected A is 15/31. 


Figure 5.15 W = win L = lose

Solution If the player wins his first game, the conditional probability that
he had selected A is

\[
\frac{\tfrac{1}{2} \times \tfrac{1}{4}}{\tfrac{1}{2} \times \tfrac{1}{4} + \tfrac{1}{2} \times \tfrac{1}{5}} = 5/9 > 1/2
\]

If he loses his first game, the conditional probability that he had selected A is

\[
\frac{\tfrac{1}{2} \times \tfrac{3}{4}}{\tfrac{1}{2} \times \tfrac{3}{4} + \tfrac{1}{2} \times \tfrac{4}{5}} = 15/31 < 1/2
\]


Since the player wishes to play the more favourable machine A, he adopts 
the following system in view of the values of the above conditional proba- 
bilities: if, on a particular machine, he wins his first play, he plays the same 
machine the next time but if he loses, he plays the other machine the next 
time. What is the probability that he will win in the long run? 

Since, in a sequence of games, the decision, at any stage, as to how the 
next game shall be played depends only upon the result of the previous game, 
we have a two-state Markov process, the two states being the playing of 
machine A and of machine B. The transition matrix is 


\[
P = \begin{pmatrix} 1/4 & 3/4 \\ 4/5 & 1/5 \end{pmatrix}
\]

with rows and columns labelled A, B, since, for example, if the player plays
machine A, the probability that he will play A in the next game is 1/4 since
this is the probability with which he wins on machine A.
The fixed vector of this matrix is easily shown to be (16/31, 15/31) so that,
by adopting this system, he will play 16/31 of the time on machine A and
15/31 of the time on machine B in the long run. Thus the probability of
winning in the long run is

\[ \tfrac{16}{31} \times \tfrac{1}{4} + \tfrac{15}{31} \times \tfrac{1}{5} = 7/31 \]
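The whole of Example 5.26 may be retraced in a few lines. The sketch below assumes only the data of the example (winning probabilities 1/4 and 1/5, machine chosen at random); the names `posterior_A`, `share_A` and `share_B` are hypothetical.

```python
from fractions import Fraction as F

win = {"A": F(1, 4), "B": F(1, 5)}     # winning probabilities from the example
prior = {"A": F(1, 2), "B": F(1, 2)}   # machine selected at random

def posterior_A(outcome_prob):
    """P(A | outcome) by Bayes' rule; outcome_prob maps machine -> P(outcome)."""
    total = sum(prior[m] * outcome_prob[m] for m in prior)
    return prior["A"] * outcome_prob["A"] / total

after_win = posterior_A(win)
after_loss = posterior_A({m: 1 - w for m, w in win.items()})

# Stay after a win, change after a loss: a = P(A -> B), b = P(B -> A).
a, b = 1 - win["A"], 1 - win["B"]
share_A, share_B = b / (a + b), a / (a + b)       # fixed vector of the chain
long_run_win = share_A * win["A"] + share_B * win["B"]
print(after_win, after_loss, long_run_win)
```

The long-run figure lies between 1/5 and 1/4, as it must, since it is a weighted average of the two winning probabilities.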


EXERCISES 5.9 
1. Show that the stochastic matrices 


1/2 1/2 0 3 6. 28 
1/3 1/3 1/3) and [0 1 0 
0 1/4 3/4 0 2/5 3/5 


are respectively regular and non-regular. 




2. A two-state Markov process is specified by the transition matrix

\[ P = \begin{pmatrix} 1/3 & 2/3 \\ 3/4 & 1/4 \end{pmatrix} \]

Draw tree diagrams for three experiments and hence derive the matrix $P^{(3)}$.
Verify that $P^3 = P^{(3)}$.

[$P^{(3)} = \begin{pmatrix} 107/216 & 109/216 \\ 109/192 & 83/192 \end{pmatrix}$]
3. Compute the first four powers of the matrix 
Oi 110.3 
ho ved 
and hence estimate the fixed point. 
[(1/2, 1/2)] 
4. Show that the fixed point of all stochastic matrices of the type

\[ \begin{pmatrix} 1-c & c \\ c & 1-c \end{pmatrix} \qquad (0 < c < 1) \]

is (1/2, 1/2).
5. Find the fixed point of the matrix

\[
\begin{pmatrix}
1/2 & 1/2 & 0 \\
5/16 & 1/2 & 3/16 \\
0 & 3/4 & 1/4
\end{pmatrix}
\]

[(1/3, 8/15, 2/15)]
6. The transition matrix of a three-state Markov process is

\[
\begin{pmatrix}
0 & 1 & 0 \\
0.3 & 0.5 & 0.2 \\
0.4 & 0.3 & 0.3
\end{pmatrix}
\]

If the initial state is specified by the probability vector (1/3, 1/6, 1/2), find the
probabilities of the three states after two experiments.
[$p^{(2)}$ = (0.243, 0.589, 0.168)]


7. Every year 1% of the workers in the South East of England move to other jobs
elsewhere whilst 3% of the workers move from other parts of England to jobs in
the South East. Show that, after a long time, the proportions of all workers in
the South East and elsewhere will be respectively 75% and 25% and that 3/4% of
all workers will be moving in each direction.


8. With reference to Example 5.26, if the player has played twice and has had two
successive wins, find the probability that he had chosen machine A. What are the
probabilities that he had chosen machine A having obtained the other possible
pairs of outcomes win-lose, lose-win and lose-lose?
[25/41; 75/139, 75/139, 225/481]


9. With reference to Exercise 8, the player adopts a system based on the conditional 
probabilities obtained. He plays a machine twice. If he obtains either win-win, 
win-lose or lose-win, he plays two further games on the same machine. If he 
obtains lose-lose, he changes machines. On this system calculate the probability 
of winning in the long run. 


[Transition matrix $\begin{pmatrix} 7/16 & 9/16 \\ 16/25 & 9/25 \end{pmatrix}$; fixed vector (256/481, 225/481); probability
of winning 109/481]


6 Boolean algebras 


6.1 Introduction 


The publication, in 1854, of a book entitled The Laws of Thought by the 
English mathematician, George Boole (1815-1864), marked a significant 
advance in the development of modern algebra. He showed that, by the 
symbolic treatment of the premises of a given proposition, the latter could 
be reduced to algebraic equations from which conclusions, logically con- 
tained in the premises, could be deduced by mathematical methods, i.e. he 
developed a system of “symbolic logic”—a general symbolic method of 
logical inference. This algebra was called “Boolean Algebra” though today 
the abstract mathematical system designated as “Boolean Algebra” is more 
precisely defined. Any Boolean Algebra will contain elements, relations and 
operations which must obey certain definite laws which will be specified below. 


6.2 The Laws of the Algebra of Sets 


The algebra of sets is an example of a Boolean algebra. The laws to be obeyed 
by a Boolean algebra will be exemplified by those obeyed by the algebra of 
sets. 


For any algebra of sets
(a) The elements are sets.
(b) The operations are union ∪, intersection ∩ and complementation ′.
(c) The relations between sets or combinations of sets are equality =
and inclusion ⊂.

(a), (b) and (c) form the structure of the algebra of sets. Now consider the
properties of this structure.


1. Closure

Figure 6.1 (a) A ∪ B (b) A ∩ B

A ⊂ I and B ⊂ I ⇒ A ∪ B ⊂ I
                ⇒ A ∩ B ⊂ I

i.e. if A and B are subsets of the universal set I, A ∪ B and A ∩ B are also
subsets of I. The algebra of sets is said to be closed under union and
intersection.


2. Identity Sets
(i) The identity set relative to union is a set whose union with any set, say A,
results in the set A. The required identity set is the empty set ∅ since

A ∪ ∅ = A = ∅ ∪ A
(ii) The identity set relative to intersection is the universal set I since again

A ∩ I = A = I ∩ A
A ∪ I = I     A ∩ ∅ = ∅


3. Idempotent Laws
A ∪ A = A and A ∩ A = A


4. Complements
If A is a subset of I and A′ is its complement,

A ∪ A′ = I and A ∩ A′ = ∅
Also
(A′)′ = A     ∅′ = I     I′ = ∅




The following laws are readily illustrated by Venn diagrams. 


5. Commutative Laws
(i) Union A ∪ B = B ∪ A
(ii) Intersection A ∩ B = B ∩ A


6. Associative Laws
(i) Union A ∪ (B ∪ C) = (A ∪ B) ∪ C
(ii) Intersection A ∩ (B ∩ C) = (A ∩ B) ∩ C


7. Distributive Laws
(i) Union over intersection A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(ii) Intersection over union A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)


8. De Morgan’s Laws
Figure 6.2 illustrates (A ∪ B)′ = A′ ∩ B′

Figure 6.2 (a) (A ∪ B)′ (b) A′ (c) B′

Similarly (A ∩ B)′ = A′ ∪ B′. It will be observed that if ∅ and I, and ∪
and ∩, be interchanged where they occur in any one of the above laws,
another law of the algebra of sets is obtained. This is the principle of duality
by which a law gives rise to its dual.
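The laws of the algebra of sets can be checked exhaustively for any small universal set. The following sketch uses Python's built-in `frozenset`; the three-element universal set and the helper name `comp` are assumptions made purely for illustration.

```python
from itertools import combinations, product

I = frozenset({"a", "b", "c"})   # a small universal set chosen for illustration
subsets = [frozenset(c) for size in range(len(I) + 1)
           for c in combinations(sorted(I), size)]

def comp(X):
    """Complement of X relative to the universal set I."""
    return I - X

laws_hold = all(
    (A | B) <= I and (A & B) <= I                 # closure
    and A | (B & C) == (A | B) & (A | C)          # distributive laws
    and A & (B | C) == (A & B) | (A & C)
    and comp(A | B) == comp(A) & comp(B)          # De Morgan's laws
    and comp(A & B) == comp(A) | comp(B)
    for A, B, C in product(subsets, repeat=3)
)
print(len(subsets), laws_hold)
```

Each law appears beside its dual: interchanging `|` with `&` in one line gives the other, in accordance with the principle of duality.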


6.3 The Laws of Boolean Algebra 


The laws of the algebra of sets are a special case of the laws obeyed by the 
mathematical system called Boolean algebra of which the algebra of sets is a 
particular model. If the laws of the algebra of sets are generalized, the laws 
of Boolean algebra will be obtained. 

Let A, B, C, ... represent abstract elements, not necessarily sets. Let the
operations ∪ and ∩ be replaced by + and · respectively. The operations +
and ·, though called addition and multiplication, must not be considered to
possess the meanings which they have in ordinary algebra. They will represent
operations which are defined for the model under consideration. Similarly
∅ and I will be replaced by 0 and 1. The elements 0 and 1 will have only those
properties which are prescribed by the laws.

If these replacements be made in the laws of the algebra of sets, then the
following laws for Boolean algebra will be obtained.


1. Closure
If A and B are any elements of the Boolean algebra, A + B and A·B will
also be elements of that algebra.


2. Identity Laws
A + 0 = A     A·1 = A
A + 1 = 1     A·0 = 0


3. Idempotent Laws
A + A = A     A·A = A


4. Complement Laws
A + A′ = 1     A·A′ = 0
(A′)′ = A     0′ = 1     1′ = 0


5. Commutative Laws
A + B = B + A     A·B = B·A


6. Associative Laws
A + (B + C) = (A + B) + C
A·(B·C) = (A·B)·C


7. Distributive Laws*
A + (B·C) = (A + B)·(A + C)
A·(B + C) = (A·B) + (A·C)


8. De Morgan’s Laws
(A + B)′ = A′·B′
(A·B)′ = A′ + B′

*Notation In the remainder of this chapter, the brackets will be omitted from such
products as (B·C) so that, for example, A + (B·C) will be denoted by A + B·C.
The brackets will, however, be retained in such expressions as A·(B + C).


EXAMPLE 6.1

Consider a Boolean algebra of sets comprising the set of four elements
F = (I, A, A′, ∅) where I = (a, b, c) and A = (a, b) so that A′ = (c).
Table 6.1 gives all the possible unions and intersections of the four elements
and their complements.

 ∪ | I   A   A′  ∅        ∩ | I   A   A′  ∅        Ele. | Ele.′
---+---------------      ---+---------------       -----+------
 I | I   I   I   I        I | I   A   A′  ∅         I   |  ∅
 A | I   A   I   A        A | A   A   ∅   ∅         A   |  A′
 A′| I   I   A′  A′       A′| A′  ∅   A′  ∅         A′  |  A
 ∅ | I   A   A′  ∅        ∅ | ∅   ∅   ∅   ∅         ∅   |  I

TABLE 6.1

The tables have been completed on the assumption that the identity,
complement and commutative laws hold. It may now be readily verified from the
tables that the property of closure holds for the four elements under the
operations ∪, ∩ and ′, and that the associative, distributive and De Morgan’s
laws also hold.


EXAMPLE 6.2
Let D comprise the set of the four positive integral divisors of 14, i.e. D =
(14, 7, 2, 1).

If x and y are any two elements of D, let + and · be interpreted to mean
that

x + y is the L.C.M. of x and y and x·y is the H.C.F. of x and y

Let x′ = 14/x.
Table 6.2 gives all possible combinations of the four elements under the
operations +, · and ′.

 +  | 14  7   2   1        ·  | 14  7   2   1        x  | x′
----+---------------      ----+---------------      ----+----
 14 | 14  14  14  14       14 | 14  7   2   1        14 | 1
 7  | 14  7   14  7        7  | 7   7   1   1        7  | 2
 2  | 14  14  2   2        2  | 2   1   2   1        2  | 7
 1  | 14  7   2   1        1  | 1   1   1   1        1  | 14

TABLE 6.2


It may readily be verified from these tables that all the laws of Boolean
algebra are obeyed in this case. For example, consider the distributive law
A·(B + C) = A·B + A·C.

When A = 2, B = 7, C = 14,

A·(B + C) = 2·(7 + 14) = 2·14 = 2
A·B + A·C = 2·7 + 2·14 = 1 + 2 = 2

An examination of the tables for F = (I, A, A′, ∅) and D = (14, 7, 2, 1)
will show that they have the same structure. In fact, the tables for F will be
transformed into the tables for D if we replace

I    A    A′   ∅    ∪    ∩    ′

by

14   7    2    1    +    ·    ′

There is a complete one-to-one correspondence between the Boolean algebras
of F and D, which are said to be isomorphic.

Any statement concerning the elements of F must have a corresponding
statement concerning the elements of D. For example

(A ∩ A′) ∪ A′          (7·2) + 2
= ∅ ∪ A′               = 1 + 2
= A′                   = 2

It will be shown later that the Boolean algebras which are applicable to
switching circuits and to the logic of statements are isomorphic to a simple
Boolean algebra of sets. It follows that the theory of sets may be applied to
circuitry and logic. This example indicates an important reason why
mathematicians are interested in mathematical structures.
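The isomorphism between F and D may be confirmed by checking that the replacement carries every union, intersection and complement in F to the corresponding L.C.M., H.C.F. and quotient 14/x in D. This is a sketch under that reading of the text; the names `F_elems`, `to_D` and `lcm` are illustrative.

```python
from math import gcd

I = frozenset("abc")
A = frozenset("ab")
F_elems = [I, A, I - A, frozenset()]          # I, A, A', empty set
to_D = dict(zip(F_elems, [14, 7, 2, 1]))      # the replacement given in the text

def lcm(x, y):
    """Least common multiple of two positive integers."""
    return x * y // gcd(x, y)

isomorphic = all(
    to_D[X | Y] == lcm(to_D[X], to_D[Y])      # union        <-> L.C.M.
    and to_D[X & Y] == gcd(to_D[X], to_D[Y])  # intersection <-> H.C.F.
    for X in F_elems for Y in F_elems
) and all(to_D[I - X] == 14 // to_D[X] for X in F_elems)   # complement <-> 14/x
print(isomorphic)
```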


6.4 Binary Boolean Algebra 


The simplest Boolean algebra contains only two elements, 0 and 1. With
reference to the Laws of Boolean algebra, this means that the variables
A, B, C, ... each take the values 0 and 1 only.

This system is called binary Boolean algebra. A binary Boolean algebra
with two variables A and B under the operations +, · and ′ is defined by
Table 6.3.

        A                  A
   + | 0   1          · | 0   1          A | A′
  ---+-------        ---+-------        ---+----
 B 0 | 0   1        B 0 | 0   0          0 | 1
   1 | 1   1          1 | 0   1          1 | 0

TABLE 6.3

An algebra of sets consisting of two elements ∅ and I is a model of a
binary Boolean algebra, and union, intersection and complementation are
defined by Table 6.4.

   ∪ | ∅   I          ∩ | ∅   I          Ele. | Ele.′
  ---+-------        ---+-------         -----+------
   ∅ | ∅   I          ∅ | ∅   ∅           ∅   |  I
   I | I   I          I | ∅   I           I   |  ∅

TABLE 6.4

The complete one-to-one correspondence between these systems, i.e. their
isomorphism, is obvious. It may readily be verified that all the laws of
Boolean algebra are satisfied by such binary systems, e.g. De Morgan’s
Laws are verified in Table 6.5.

 A | B | A + B | A·B | A′ | B′ | A′·B′ | (A + B)′ | A′ + B′ | (A·B)′
---+---+-------+-----+----+----+-------+----------+---------+--------
 1 | 1 |   1   |  1  | 0  | 0  |   0   |    0     |    0    |   0
 1 | 0 |   1   |  0  | 0  | 1  |   0   |    0     |    1    |   1
 0 | 1 |   1   |  0  | 1  | 0  |   0   |    0     |    1    |   1
 0 | 0 |   0   |  0  | 1  | 1  |   1   |    1     |    1    |   1

TABLE 6.5

From the last four columns, it will be seen that

(A + B)′ = A′·B′     (A·B)′ = A′ + B′


EXAMPLE 6.3
By applying the laws of Boolean algebra, show that

(i) (A′·B′)′ + (A′·B)′ = 1
(ii) (A + B)·(A + B′) = A
(iii) B·{(C + D)′ + B} = B

(i) (A′·B′)′ + (A′·B)′ = A + B + A + B′ = (A + A) + (B + B′)
        = A + 1 = 1

(ii) (A + B)·(A + B′) = (A + B)·A + (A + B)·B′
        = A·A + B·A + A·B′ + B·B′
        = A + B·A + A·B′ + 0
        = A + A·(B + B′) = A + A·1 = A + A = A

(iii) B·{(C + D)′ + B} = B·(C′·D′ + B) = B·C′·D′ + B·B
        = B·C′·D′ + B
        = B·(C′·D′ + 1) = B·1
        = B
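The three results of this example may be confirmed by tabulating both sides over all values 0 and 1 of the variables. The sketch below checks the identities in the two-element algebra only; it is not a substitute for the derivations from the laws. The helper names `NOT`, `OR`, `AND` and `always_equal` are assumptions.

```python
from itertools import product

def NOT(x):
    return 1 - x

def OR(x, y):
    return x | y

def AND(x, y):
    return x & y

def always_equal(f, g, nvars):
    """True if f and g agree for every assignment of 0/1 to their variables."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=nvars))

checks = [
    # (i)   (A'.B')' + (A'.B)' = 1
    always_equal(lambda A, B: OR(NOT(AND(NOT(A), NOT(B))), NOT(AND(NOT(A), B))),
                 lambda A, B: 1, 2),
    # (ii)  (A + B).(A + B') = A
    always_equal(lambda A, B: AND(OR(A, B), OR(A, NOT(B))),
                 lambda A, B: A, 2),
    # (iii) B.{(C + D)' + B} = B
    always_equal(lambda B, C, D: AND(B, OR(NOT(OR(C, D)), B)),
                 lambda B, C, D: B, 3),
]
print(checks)
```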


EXERCISES 6.1
1. S is the set of integral divisors of an integer n. For any elements x, y ∈ S, x + y
and x·y are respectively defined to be the L.C.M. and the H.C.F. of x and y
and x′ = n/x.
Determine whether S is a model of a Boolean Algebra under these operations
if n is (a) 6, (b) 10, (c) 12.

2. If A, B and C take the values 0, 1 only, evaluate the following polynomials by
tabulation as at the end of Section 6.4.
(i) A·B′ + C (ii) (A + B)·(A + C) (iii) A·(A·B′ + A) + A·B
(Note that for three variables A, B, C, there will be 2^3 = 8 possible groups of
values.)

3. Use the tabular method to verify the following results for a binary Boolean
algebra
(i) A + A·B = A
(ii) A + B = (A′·B′)′     A·B = (A′ + B′)′     (De Morgan’s Laws)
(iii) A + B·C = (A + B)·(A + C)     A·(B + C) = A·B + A·C
(Distributive Laws)
(iv) (A + B)·(A′ + C)·(B + C) = (A + B)·(A′ + C)

4. By applying the laws of Boolean algebra, show that
(i) A·(A′ + B)′ = A·B′
(ii) (A·B′ + C + A′)·B = B·(C + A′)
(iii) A·B + A′·B′ = (A′ + B)·(A + B′)
(iv) A + A′·C + B = A + B + C
(vy) (4+ B):(44+04+B:°(B'+C)=A:C4+B 




6.5 Switching Circuits 


A binary Boolean algebra may be interpreted as “the algebra of circuits”.
In recent years, this algebra has been widely applied in the design of electronic
computers and telephone dialling systems.

Suppose two electrical terminals are connected through a number of
switches A, B, C, ... in series (Fig. 6.3). When a switch is closed, current may
pass through it but, when open, no current flows. Thus current flows between
the two terminals only when all the switches A, B, C, ... are closed. For
switches A, B, C, ... in series we write A·B·C···, a conjunction of
A, B, C, .... If the terminals are connected by switches A, B, C, ... in
parallel (Fig. 6.4), the current flows if at least one of the switches is closed.
For switches A, B, C, ... in parallel, we write A + B + C + ···, a
disjunction of A, B, C, ....

Figure 6.3 Series

Figure 6.4 Parallel

Let 1 denote that a given switch is closed and let 0 denote that it is open.
Consider the cases of a pair of switches A and B in series and in parallel (Fig.
6.5). For various positions of the switches A and B, all the cases which arise
are listed in Table 6.6.


Figure 6.5 (a) Series: A·B (b) Parallel: A + B

 Switch A | Switch B | Parallel A + B  | Series A·B
          |          | (Disjunction)   | (Conjunction)
----------+----------+-----------------+---------------
    1     |    1     |        1        |       1
    1     |    0     |        1        |       0
    0     |    1     |        1        |       0
    0     |    0     |        0        |       0

TABLE 6.6

For two switches, there are 2^2 = 4 possible conditions. These are listed in the
four lines in the first two columns of the table. The appearance of 1 in the
column headed “Parallel” indicates that, for a particular arrangement of the
switches A and B, the parallel circuit passes current, i.e. is closed. An entry of
0 means that the contrary is true. The appearance of 1 and 0 in the column
headed “Series” is similarly interpreted.

Let A′ (the complement of A) be a switch which is open when A is closed
and vice versa, i.e. A′ has the value 0 when A has the value 1 and vice versa.

These results may be tabulated as in Table 6.7.

        A                  A
   + | 0   1          · | 0   1          A | A′
  ---+-------        ---+-------        ---+----
 B 0 | 0   1        B 0 | 0   0          0 | 1
   1 | 1   1          1 | 0   1          1 | 0
   Parallel           Series

TABLE 6.7

The algebra of circuits has precisely the structure of the simplest binary
Boolean algebra in two variables A and B, taking the values 0 and 1 only,
which was introduced above. It is likewise isomorphic with the algebra of sets,
consisting of two elements ∅ and I, referred to above (Tables 6.3 and 6.4).

The laws of Boolean algebra may therefore be applied to the variables A,
B, C, ... of the algebra of circuits, for example

A + B·C = (A + B)·(A + C)

Interpreting this law of Boolean algebra in terms of electrical circuits, the
circuits in Fig. 6.6 must be equivalent.


Figure 6.6 Circuits for A + B·C and (A + B)·(A + C)


EXAMPLE 6.4 
Design a circuit to put a light on a staircase on and off by two switches A and 
B at the top and bottom of the staircase. 


Solution Mathematically speaking, the circuit must satisfy the conditions
tabulated in Table 6.8.

 A | B | Required condition of circuit
---+---+-------------------------------
 1 | 1 | 1 (light on)
 1 | 0 | 0 (light off)
 0 | 1 | 0 (light off)
 0 | 0 | 1 (light on)

TABLE 6.8

Consider the values of the Boolean polynomial A·B + A′·B′ for these
values of A and B. It will be seen from Table 6.9 that they meet the
requirements of this problem.

 A | B | A·B | A′·B′ | A·B + A′·B′
---+---+-----+-------+-------------
 1 | 1 |  1  |   0   |      1
 1 | 0 |  0  |   0   |      0
 0 | 1 |  0  |   0   |      0
 0 | 0 |  0  |   1   |      1

TABLE 6.9

It should be noted that, for all conditions of the switches A and B, the
conjunction A·B has the value 1 only when A = B = 1, being otherwise 0.
The conjunction A′·B′ has the value 1 only when A = B = 0. It follows that
A·B + A′·B′ must satisfy the conditions of the problem. A·B and A′·B′
are called the basic conjunctions for A = B = 1 and A = B = 0 respectively.

We must therefore construct the circuit corresponding to A·B + A′·B′.
It is shown in Fig. 6.7.

Figure 6.7

However since

A·B + A′·B′ = (B + A′)·(A + B′)

because A′·A = B·B′ = 0,

it follows that an equivalent electrical circuit to that in Fig. 6.7 is the one
shown in Fig. 6.8.

Figure 6.8
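The equivalence of the two staircase circuits may be confirmed by treating series and parallel connection as functions of the switch values. The following is a sketch only; the function names `series`, `parallel`, `staircase_fig_6_7` and `staircase_fig_6_8` are hypothetical labels for the polynomials A·B + A′·B′ and (B + A′)·(A + B′).

```python
from itertools import product

def series(*switches):
    """Current flows only if every switch is closed (conjunction)."""
    return int(all(switches))

def parallel(*branches):
    """Current flows if at least one branch is closed (disjunction)."""
    return int(any(branches))

def staircase_fig_6_7(A, B):
    # A.B + A'.B' : two series pairs connected in parallel
    return parallel(series(A, B), series(1 - A, 1 - B))

def staircase_fig_6_8(A, B):
    # (B + A').(A + B') : two parallel pairs connected in series
    return series(parallel(B, 1 - A), parallel(A, 1 - B))

settings = list(product((1, 0), repeat=2))
table = {(A, B): staircase_fig_6_7(A, B) for A, B in settings}
equivalent = all(staircase_fig_6_7(A, B) == staircase_fig_6_8(A, B)
                 for A, B in settings)
print(table, equivalent)
```

The light is on exactly when the two switches are in the same position, so operating either switch alone always changes the state of the light.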
EXAMPLE 6.5 


What electrical network is equivalent to the Boolean polynomial

(A + B)·C′ + A′·C + B

By simplifying this polynomial, find the simplest equivalent circuit and
verify by tabulating all possible cases.

Solution The corresponding network is shown in Fig. 6.9.

Figure 6.9

Since

(A + B)·C′ + A′·C + B = A·C′ + B·C′ + A′·C + B
                      = A·C′ + A′·C + B·(1 + C′)
                      = A·C′ + A′·C + B

the equivalent network is as shown in Fig. 6.10.

Figure 6.10

The three switches A, B, C can be set in 2^3 = 8 possible ways. From
Table 6.10 it will be seen that, for each of the eight possible settings of the
switches, the values of the polynomials (A + B)·C′ + A′·C + B and
A·C′ + A′·C + B agree.

This problem may be solved directly from a table giving all possible settings
of the switches and the corresponding conditions which the circuit must
satisfy by using an appropriate disjunction of the basic conjunctions for three
switches.

If A = B = C = 1 (first line of Table 6.11) then the conjunction A·B·C =
1. For all other possible values of A, B, C, A·B·C = 0.

This may be interpreted electrically as follows. If the switches A, B, C are
all closed and for no other settings, the circuit represented by A·B·C, i.e.
—A—B—C—, will be closed.

Similarly the basic conjunction A·B·C′ has the value 1 when A = B =
C′ = 1 (C = 0) but is 0 for all other settings of the switches. It follows that
the polynomial

A·B·C + A·B·C′ + A·B′·C′ + A′·B·C + A′·B·C′ + A′·B′·C

has the value 1 for all settings of the switches except for those in lines 3 and 8
where it has the value 0. The above polynomial therefore satisfies the
conditions of this problem and represents the required circuit.


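 A | B | C | C′ | A + B | (A + B)·C′ | A′·C | (A + B)·C′ + A′·C + B | A·C′ | A·C′ + A′·C + B
---+---+---+----+-------+------------+------+------------------------+------+-----------------
 1 | 1 | 1 | 0  |   1   |     0      |  0   |           1            |  0   |        1
 1 | 1 | 0 | 1  |   1   |     1      |  0   |           1            |  1   |        1
 1 | 0 | 1 | 0  |   1   |     0      |  0   |           0            |  0   |        0
 1 | 0 | 0 | 1  |   1   |     1      |  0   |           1            |  1   |        1
 0 | 1 | 1 | 0  |   1   |     0      |  1   |           1            |  0   |        1
 0 | 1 | 0 | 1  |   1   |     1      |  0   |           1            |  0   |        1
 0 | 0 | 1 | 0  |   0   |     0      |  1   |           1            |  0   |        1
 0 | 0 | 0 | 1  |   0   |     0      |  0   |           0            |  0   |        0

TABLE 6.10

 A | B | C | Desired condition of the circuit | Corresponding basic conjunction
---+---+---+----------------------------------+---------------------------------
 1 | 1 | 1 |                1                 | A·B·C
 1 | 1 | 0 |                1                 | A·B·C′
 1 | 0 | 1 |                0                 | —
 1 | 0 | 0 |                1                 | A·B′·C′
 0 | 1 | 1 |                1                 | A′·B·C
 0 | 1 | 0 |                1                 | A′·B·C′
 0 | 0 | 1 |                1                 | A′·B′·C
 0 | 0 | 0 |                0                 | —

TABLE 6.11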


The polynomial may now be simplified by the laws of Boolean algebra
thus:

A·C′·(B + B′) + A′·C·(B + B′) + B·(A·C + A′·C′)
    = A·C′ + A′·C + B·(A·C + A′·C′)
    = A·(C′ + B·C) + A′·(C + B·C′)
    = A·(C′ + B)·(C′ + C) + A′·(C + B)·(C + C′)
    = A·(C′ + B) + A′·(C + B)
    = A·C′ + A′·C + B·(A + A′)
    = A·C′ + A′·C + B as before
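The direct method, in which the required polynomial is written as a disjunction of basic conjunctions, lends itself to mechanical treatment. The sketch below builds that disjunction from any table of desired conditions; the function names and the ASCII notation (`.` for ·, `'` for ′) are choices made here for illustration.

```python
from itertools import product

def basic_conjunctions(desired, names="ABC"):
    """Disjunction of the basic conjunctions for the settings marked 1.

    Settings are taken in the order of Table 6.11: 111, 110, 101, ..., 000.
    """
    terms = []
    for values in product((1, 0), repeat=len(names)):
        if desired(*values):
            terms.append(".".join(n if v else n + "'"
                                  for n, v in zip(names, values)))
    return " + ".join(terms)

def simplified(A, B, C):
    # A.C' + A'.C + B, the simplified polynomial of Example 6.5
    return (A & (1 - C)) | ((1 - A) & C) | B

polynomial = basic_conjunctions(simplified)
print(polynomial)
```

Here the table of desired conditions is supplied by the simplified polynomial itself, so the output is the six-term disjunction from which the simplification starts.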


EXERCISES 6.2
1. Draw the circuits represented by
(a) (A + B)·(A + C) (b) A + (B·C)
(c) (A + B′)·(A′ + B)·(A′ + B′)
(d) {A·B + C′}·{(A′ + C)·B}
(e) (A + B′ + C′)·A′ + (B + C′)·D

2. What polynomials represent the circuits in Fig. 6.11? Show that these polynomials
may be simplified to
(i) A·B′·C (ii) A + B·C (iii) B·C (iv) A·B
and hence draw simpler equivalent circuits. Verify by tabulation.

Figure 6.11 Circuits (i) to (iv)

3. Design a circuit for a light at the foot of a staircase which is to be operated by
three switches on different floors of a house.

4. Design a circuit for four switches controlling an electric lamp which is to light if
(a) two or more switches are closed, (b) two or less are closed, (c) exactly two
are closed.

5. Each of four men in a rocket may operate his own switch controlling two lights.
Design a circuit so that (a) a “safe” light goes on if all is ready to proceed, (b)
a “danger” light goes on if anyone is signalling “danger”.

6. A three-man Committee wishes to employ an electric circuit to indicate a secret
simple majority vote. Design a circuit so that each member may push a button
to indicate his “yes” vote (not push it for a “no” vote) and so that a light will
go on if the majority vote “yes”.
[A·B·C + A·B·C′ + A·B′·C + A′·B·C = A·B + A·C + C·A′·B]


6.6 The Algebra of the Logic of Statements 


In the algebra of sets, symbols have been used to represent sets and elements
of sets. Symbols have also been used to represent electrical switches thereby
formulating a Boolean algebra of “switching circuits”. We shall now employ
symbols in logic to represent statements or propositions such as “Miss X is a
University student” or “Miss X is beautiful”.

In this algebra, a statement is either true (T) or false (F). Sentences like
“How do you do?” or “There are living creatures on Mars” are not
statements in this context since their “truth values” are meaningless or doubtful.

Statements will be designated by p, q, r, ... and will be used to form
compound statements under the operations ∧, ∨ and ~ which mean “and”,
“or” and “not” respectively. These operations are analogous to ·, + and ′ of
the binary Boolean algebra. The truth values T and F of statements are
analogous to the values 1 and 0 taken by the variables of the binary Boolean
algebra. The algebra of logic is isomorphic with the binary Boolean algebra.

Let p denote the statement “Miss X is a University student”. Its negation,
denoted by ~p, would represent the statement “Miss X is not a University
student”. If p is true (T), ~p must be false (F), and vice versa. Let q denote
the statement “Miss X is beautiful”. The compound statement “Miss X is a
University student and Miss X is beautiful” is denoted by p ∧ q (p and q).
The statement p ∧ q is called the conjunction of p and q and is true only if both
p and q are true and is otherwise false.

The compound statement p or q is denoted by p ∨ q and is called the
disjunction of p and q. Clearly p ∨ q is true when at least one of the statements
p and q is true and is false only when both p and q are false. (The use of the
terms “conjunction” and “disjunction” here is directly analogous to their
use in Section 6.5 for a pair of switches in series and in parallel.)

The various possibilities are summarized in the “truth tables” (Table 6.12). 


 p | q | p ∧ q        p | q | p ∨ q        p | ~p
---+---+-------      ---+---+-------      ---+----
 T | T |   T          T | T |   T          T | F
 T | F |   F          T | F |   T          F | T
 F | T |   F          F | T |   T
 F | F |   F          F | F |   F

  Conjunction          Disjunction

TABLE 6.12


The isomorphism between the algebra of logic and the binary Boolean 
algebra is further exhibited by a comparison of Table 6.12 with the first four 
columns of Table 6.5. 

Several compound statements of the types p ∧ q and p ∨ q may be combined
by using the basic connectives ∧, ∨, ~ and others (described below) to
form more complicated compound statements, such as


~{(p ∨ q) ∧ (~p ∨ ~q)}


The truth tables of such compound statements may be determined by a 
routine procedure which is illustrated for the above statement in Table 6.13. 


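p  q | ~p  ~q | p ∨ q | ~p ∨ ~q | (p ∨ q) ∧ (~p ∨ ~q) | ~{(p ∨ q) ∧ (~p ∨ ~q)}
T  T |  F   F |   T   |    F    |          F          |           T
T  F |  F   T |   T   |    T    |          T          |           F
F  T |  T   F |   T   |    T    |          T          |           F
F  F |  T   T |   F   |    T    |          F          |           T

TABLE 6.13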




CONDITIONAL STATEMENTS 
Let us now consider another connective between two statements p and q.
Let p denote the statement “you work hard” and let q denote the statement
“you will pass your examination”.

The compound statement “if you work hard you will pass your examination”
will be denoted by p ⇒ q meaning “if p then q”. The conditional
statement p ⇒ q is defined by truth Table 6.14.

p  q | p ⇒ q
T  T |   T
T  F |   F
F  T |   T
F  F |   T

TABLE 6.14


If p is the hypothesis and q the conclusion, the conditional statement
p ⇒ q is regarded as being false only when a false conclusion has been obtained
from a true hypothesis, i.e. p ⇒ q is false for the second line only. In
the third line, the statement p ⇒ q is considered to be true since no assertion
has been made about what happens if “you do not work hard”.


BICONDITIONAL STATEMENTS 

The connective ⇔ is used to denote biconditional statements. Thus p ⇔ q is
read “if p then q and if q then p” or “p if and only if q” and has the truth
Table 6.15.

p  q | p ⇔ q
T  T |   T
T  F |   F
F  T |   F
F  F |   T

TABLE 6.15


From Table 6.13 it will be seen that ~{(p ∨ q) ∧ (~p ∨ ~q)} has the same
truth values as p ⇔ q and is said to be equivalent to it.
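
The routine procedure for building a truth table lends itself to a short program. What follows is a minimal sketch in Python and is not part of the original text; the helper name truth_table and the use of True/False for T/F are assumptions made for illustration. It lists rows in the order TT, TF, FT, FF and compares the compound statement ~{(p ∨ q) ∧ (~p ∨ ~q)} with the biconditional.

```python
from itertools import product

def truth_table(statement, n_vars):
    """Return the truth values of `statement` for every assignment.

    Rows are generated in the order TT, TF, FT, FF (True before False).
    """
    return [statement(*values) for values in product([True, False], repeat=n_vars)]

# ~{(p v q) ^ (~p v ~q)}
compound = lambda p, q: not ((p or q) and ((not p) or (not q)))

# p <=> q: true exactly when p and q have the same truth value
biconditional = lambda p, q: p == q

compound_column = truth_table(compound, 2)
biconditional_column = truth_table(biconditional, 2)

# Two statements are equivalent when their columns agree row by row.
equivalent = compound_column == biconditional_column
print(compound_column, equivalent)
```

The same helper handles three or more simple statements by raising n_vars, at the cost of doubling the number of rows each time.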

Compound statements may be formed from three or more simple statements
using any of the five basic connectives ∨, ∧, ~, ⇒, ⇔. The statement


{p ⇒ (q ⇒ r)} ⇒ {(p ⇒ q) ⇒ (p ⇒ r)}


is an example of a compound statement involving three simple statements
p, q and r. Its truth table is worked out in Table 6.16. There are 2³ = 8
possible groups of truth values for the statements p, q and r so that the table
has eight rows. If more than three simple statements are involved, there will
be correspondingly more rows but the routine procedure for determining the
truth table is unchanged.


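Column headings: X = p ⇒ (q ⇒ r), Y = (p ⇒ q) ⇒ (p ⇒ r).

p  q  r | q ⇒ r |  X  | p ⇒ q | p ⇒ r |  Y  | X ⇒ Y
T  T  T |   T   |  T  |   T   |   T   |  T  |   T
T  T  F |   F   |  F  |   T   |   F   |  F  |   T
T  F  T |   T   |  T  |   F   |   T   |  T  |   T
T  F  F |   T   |  T  |   F   |   F   |  T  |   T
F  T  T |   T   |  T  |   T   |   T   |  T  |   T
F  T  F |   F   |  T  |   T   |   T   |  T  |   T
F  F  T |   T   |  T  |   T   |   T   |  T  |   T
F  F  F |   T   |  T  |   T   |   T   |  T  |   T

TABLE 6.16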


It will be observed that the compound statement is true for all possible
truth values of p, q and r. Such statements are said to be logically true.
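
Whether a statement is logically true can be checked mechanically by running through all 2ⁿ rows. The sketch below is illustrative only; the names implies and is_logically_true are assumptions, not notation from the text. It tests the statement {p ⇒ (q ⇒ r)} ⇒ {(p ⇒ q) ⇒ (p ⇒ r)} and, for contrast, a statement that is not logically true.

```python
from itertools import product

def implies(p, q):
    """Conditional p => q: false only when p is true and q is false."""
    return (not p) or q

def is_logically_true(statement, n_vars):
    """True when `statement` holds for every assignment of truth values."""
    return all(statement(*values) for values in product([True, False], repeat=n_vars))

# {p => (q => r)} => {(p => q) => (p => r)}
table_616 = lambda p, q, r: implies(
    implies(p, implies(q, r)),
    implies(implies(p, q), implies(p, r)),
)

# (p => q) => (q => p): fails when p is false and q is true
converse_claim = lambda p, q: implies(implies(p, q), implies(q, p))

print(is_logically_true(table_616, 3))
print(is_logically_true(converse_claim, 2))
```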


CONVERSE, INVERSE AND CONTRAPOSITIVE PROPOSITIONS 
Having shown something of the process of the formalization of plain language 
into symbolic logic, let us now see how this process can help to eliminate 
some of the illogical reasoning which we meet so often in everyday life. 
Consider the following compound statement: 
“If this University is a British type of University, then this University is a 
good University”. 
Let p represent the first component and q the second. Then the above
implication may be written symbolically as


p ⇒ q

Consider the following three variants of this implication:


q ⇒ p      “If this University is a good University, then this University
           is a British type of University” (Converse)
~p ⇒ ~q    “If this University is not a British type of University, then this
           University is not a good University” (Inverse)
~q ⇒ ~p    “If this University is not a good University, then this University
           is not a British type of University” (Contrapositive)


Let us work out their truth tables using Table 6.14. The result is given in
Table 6.17. Since (p ⇒ q) and (~q ⇒ ~p) are both true or both false, i.e.
have the same truth values irrespective of the values of p and q, it follows
that (p ⇒ q) and (~q ⇒ ~p) are equivalent, i.e. a proposition is equivalent
to its contrapositive. This equivalence relationship is called a tautology.

p  q | ~p  ~q | p ⇒ q | q ⇒ p | ~p ⇒ ~q | ~q ⇒ ~p
T  T |  F   F |   T   |   T   |    T    |    T
T  F |  F   T |   F   |   T   |    T    |    F
F  T |  T   F |   T   |   F   |    F    |    T
F  F |  T   T |   T   |   T   |    T    |    T

TABLE 6.17

Similarly (q ⇒ p) and (~p ⇒ ~q) are equivalent, i.e. the converse is
equivalent to the inverse.

However, (p ⇒ q) and (~p ⇒ ~q) are not equivalent since the two components
do not have the same truth values. If, therefore, we assume that
p ⇒ q holds, it does not always follow that ~p ⇒ ~q. That is, for example,
it is not necessarily true that “If this University is not a British type of
University then this University is not a good University”. In fact, it is
logically incorrect to draw such a conclusion. Unfortunately, illogical
reasoning of this kind is all too common.

The tautology between (p ⇒ q) and (~q ⇒ ~p) is of considerable
importance in mathematics, being widely used in the method of indirect
proof. If there is difficulty in proving the proposition (p ⇒ q) directly, it may
often be proved more easily in the equivalent contrapositive form,
(~q ⇒ ~p).

The algebras of sets consisting of two elements I and ∅, of circuits and of
logic are all models of the binary Boolean algebra. The isomorphism between 
the structures of these algebras is exhibited by the examples of the one-to-one 
correspondences between them which are tabulated in Table 6.18. 


Circuits      Logic

Parallel      Disjunction
Series        Conjunction
Open
Closed

TABLE 6.18


EXERCISES 6.3 
1. If p = “I should like to go swimming today”
q = “I should like to play tennis today”
Write the following statements in words:
(a) ~p (b) p ∨ q (c) p ∧ q (d) ~(p ∧ q) (e) ~p ∧ q
(f) (~p ∧ q) ∨ (p ∧ ~q)

2. Write the following statements in symbolic form, letting p be “Fred is tall”
and q be “George is tall”.
(a) Fred is tall and George is short
(b) Fred and George are both short
(c) Either Fred is tall or George is short
(d) Neither Fred nor George is tall
(e) It is not true that Fred and George are both short
Assuming that Fred and George are both tall, which of the above compound
statements are true?

3. Construct a truth table for the statement ~p ∨ q and compare it with the truth
table for p ⇒ q.

4. Construct truth tables for the following:
(a) ~(p ∧ q) (b) (p ∨ ~q) ∧ r (c) ~{(~p ∧ ~q) ∧ (p ∨ r)}
[(a) FTTT (b) TFTFFFTF (c) TTTTTTTT]

5. Show that
(a) ~p ∨ (p ∨ q) (b) p ⇒ (p ∨ q) (c) (p ⇒ q) ⇔ (~p ∨ q)
are logically true but p ∧ ~(p ∨ q) is logically false.

6. Show that
(a) (p ⇒ q) ∧ (q ⇒ p) is equivalent to p ⇔ q
(b) ~p ⇔ (p ⇒ ~q) is equivalent to p ⇒ q.
(c) ~p ∨ (~q ∧ ~r) and p ⇒ {p ∧ ~(q ∨ r)} are equivalent.

7. Verify the following tautologies by using truth tables.
(a) p ∨ (q ∧ r) and (p ∨ q) ∧ (p ∨ r)
(b) p ∧ (q ∨ r) and (p ∧ q) ∨ (p ∧ r)
(c) (p ⇔ q) and {(p ⇒ q) ∧ (q ⇒ p)}

8. If p = “tomorrow will be a fine day”, q = “we shall go to the seaside”, write
in words the meaning of the following statements.
(a) p ⇒ q (b) q ⇒ p (c) ~p ⇒ ~q (d) ~q ⇒ ~p
Assuming the truth of (a) which of the others are true?

9. Write in words the converse, inverse and contrapositive of the following
propositions:
(a) If a triangle is equilateral, then it is isosceles.
(b) If two lines do not intersect then they are parallel.
(c) If a line joins the midpoints of two sides of a triangle then it is parallel to
the third side.
Assuming the above propositions to be true, determine the truth or falsity
of their converses, inverses and contrapositives.

10. Insert ⇒, ⇐ or ⇔ as appropriate between the pairs of statements on each line.
x² + 16 = 10x      (x = 2) or (x = 8)
x² + 16 = 10x      (x = 8)
x² + 16 = 10x      (x = 2)

11. The following “proof” depends upon incorrectly reversed implications. Find
the false step.
Let a = b = 1, then a² = ab. Therefore

a² − b² = ab − b²
(a + b)(a − b) = b(a − b)
a + b = b
2 = 1


7 Residue classes 


7.1 Congruences 
Consider the following three subsets of the integers: 


A = (..., −9, −6, −3, 0, 3, 6, 9, ...)
B = (..., −8, −5, −2, 1, 4, 7, 10, ...)
C = (..., −7, −4, −1, 2, 5, 8, 11, ...)


It will be seen that, on dividing each of the elements of subset A by 3, 
the remainder will be 0 in every case. When the same division is carried out 
for the elements of the subset B, the remainders in every case will be 1, 
whilst for the subset C, the remainders will all be 2. 

This implies that the elements of the subsets A, B and C are respectively 
of the forms 3r, 3r + 1 and 3r + 2 where r is an integer. It follows that the 
difference of any pair of elements drawn from any one of these subsets is an 
integer divisible by 3. For example from subset B we have 


7 — (—8) = 15 (divisible by 3) 


The elements of B are said to be congruent modulo 3. Similarly the elements of 
A and C are congruent modulo 3. 


DEFINITION 7.1 
If x and y are two integers such that (x − y) is divisible by n (a positive
integer), x and y are said to be congruent modulo n. This will be expressed by
the notation x ≡ y (mod n) or simply x ≡ y (n). [Note that x + y (n) will be
taken to mean (x + y) (mod n).] The usual rules of arithmetic apply for the
addition and multiplication of congruences. Let a ≡ b (n) and c ≡ d (n).
Then, for some integers r and s,


a = b + rn    and    c = d + sn
a + c = b + d + (r + s)n


Therefore    a + c ≡ b + d (n)    (addition)
ac = bd + (rd + sb)n + rsn²


Therefore    ac ≡ bd (n)    (multiplication)
Furthermore, for any integers λ, μ,


λa + μc = λb + μd + (λr + μs)n


Therefore    λa + μc ≡ λb + μd (n)
In particular, when λ = 1, μ = −1,


a − c ≡ b − d (n)    (subtraction)


The rule for the division of congruences is less straightforward and often
involves a change of modulus. The following rule for division will now be
proved.


THEOREM 7.1
If ac ≡ bd (n) and c ≡ d (n), then a ≡ b (n/h) where h is the h.c.f. of d and n
(or of c and n).
Note that since c = d + sn (s being an integer), the h.c.f. of c and n is
the same as that of d and n.

ac = bd + rn    and    c = d + sn
Therefore    ac = ad + asn = bd + rn

(a − b)d = (r − as)n


Now d = hα and n = hβ where α, β are coprime (i.e. the h.c.f. of the
integers α, β is 1). Therefore


(a − b)hα = (r − as)hβ
(a − b)α = (r − as)β


Since α, β are coprime, α must be a factor of (r − as).
Let kα = r − as where k is an integer. Therefore


(a − b) = kβ = kn/h
a ≡ b (n/h)    (division)


Corollary: If ax ≡ bx (n), then a ≡ b (n/h) where h is the h.c.f. of x and n.
EXAMPLE 7.1
(A) Illustrate Theorem 7.1 using 15 ≡ 190 (7) and 5 ≡ 19 (7).
Writing 15 ≡ 190 (7) and 5 ≡ 19 (7) in the form

ac ≡ bd (n)    and    c ≡ d (n)
gives


3.5 ≡ 10.19 (7)    and    5 ≡ 19 (7)


The h.c.f. of 19 and 7 (i.e. h) is 1. Therefore


3 ≡ 10 (7)    (true)


(B) Apply Theorem 7.1 to the congruence 12 ≡ 192 (9) to deduce a
relationship between 4 and 16. Then deduce a further relationship using the
corollary.


12 ≡ 192 (9)    i.e.    4.3 ≡ 16.12 (9)

But 3 ≡ 12 (9) and the h.c.f. of 3 and 9 is 3. Therefore
4 ≡ 16 (3) [true]    i.e.    2.2 ≡ 2.8 (3)

Therefore by the corollary, since the h.c.f. of 3 and 2 is 1,


2 ≡ 8 (3) [true]


(C) Use the congruence 12 ≡ 192 (9) to illustrate the corollary to Theorem 7.1
1.12 ≡ 16.12 (9)
The h.c.f. of 12 and 9 is 3. Therefore


1 ≡ 16 (3) [true]
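
The division rule of Theorem 7.1 can be tried out numerically. The following Python sketch is an illustration only; the function names congruent and reduced_modulus are assumptions. It checks the hypotheses ac ≡ bd (n) and c ≡ d (n), computes h as the h.c.f. of d and n, and returns the modulus n/h for which a ≡ b should hold.

```python
from math import gcd

def congruent(x, y, n):
    """x ≡ y (mod n): the difference x - y is divisible by n."""
    return (x - y) % n == 0

def reduced_modulus(a, b, c, d, n):
    """Given ac ≡ bd (n) and c ≡ d (n), return n/h with h = hcf(d, n).

    Theorem 7.1 then asserts a ≡ b (mod n/h).
    """
    if not (congruent(a * c, b * d, n) and congruent(c, d, n)):
        raise ValueError("hypotheses of Theorem 7.1 are not satisfied")
    h = gcd(d, n)
    return n // h

# Example 7.1 (A): 3.5 ≡ 10.19 (7) and 5 ≡ 19 (7)
m_a = reduced_modulus(3, 10, 5, 19, 7)
# Example 7.1 (B): 4.3 ≡ 16.12 (9) and 3 ≡ 12 (9)
m_b = reduced_modulus(4, 16, 3, 12, 9)

print(m_a, congruent(3, 10, m_a))
print(m_b, congruent(4, 16, m_b), congruent(4, 16, 9))
```

In case (B) the congruence 4 ≡ 16 holds modulo 3 but not modulo 9, which is why the change of modulus is needed.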


It is easily verified that a congruence is a reflexive, symmetric and transitive 
relationship between pairs of integers: 


(a) Since x − x = 0, x ≡ x (n)    Reflexive


(b) If x ≡ y (n), then (x − y) is divisible by n. Therefore (y − x) is also
divisible by n, so that


y ≡ x (n)    Symmetric
(c) If x ≡ y (n) and y ≡ z (n), then x − y = rn and y − z = sn. Therefore
x − z = (r + s)n    i.e.    x ≡ z (n)    Transitive


Thus x ≡ y (n) is an equivalence relation (see Vol. 1, p. 79) which partitions
the integers into n mutually exclusive equivalence classes, each consisting
of those integers which are congruent modulo n. Each of the elements of
any one of these classes will give the same remainder or residue when divided
by n. Since the possible residues are 0, 1, 2, ..., n − 1, there will be n
residue classes corresponding to modulo n. The residue class of all integers
congruent to r will be denoted by r (printed in bold type).


EXAMPLE 7.2 
(A) The congruence relation, modulo 2, yields two residue classes


0 = (..., −8, −6, −4, −2, 0, 2, 4, 6, 8, ...)
1 = (..., −7, −5, −3, −1, 1, 3, 5, 7, 9, ...)


comprising the even and odd integers respectively.
(B) Let the integers 0, 1, 2, 3, ..., (n − 1) be uniformly spaced around a
clockface. Now let n be placed at position 0, (n + 1) at position 1, (n + 2)
at position 2, and so on. It is obvious that the residue classes, modulo n, will
be found at the n positions of the clockface, and that residue class r will be at
position r. Figure 7.1 illustrates the case for n = 6.


Figure 7.1 


7.2 Algebra of Residue Classes 


If any element of the class r is added to any element of the class s, the sum
will be an element of the class which contains r + s, since


(r + λn) + (s + μn) = (r + s) + (λ + μ)n


Since r + s may exceed n, the class which contains r + s will be the class t
where t ≡ r + s (n) and 0 ≤ t ≤ n − 1 and we define r + s to be t, a
unique residue class.

It is obvious that


r + s = s + r
r + (s + t) = (r + s) + t
r + 0 = r


so that for the algebra of residues, having a finite number (n) of elements,
the commutative and associative laws hold and a zero exists. For addition, the
residue class 0 is a neutral element.

We define r − s to be the class containing r − s.

If any element of the class r is multiplied by any element of class s, the
product will be an element of the class which contains rs, since


(r + λn)(s + μn) = rs + νn    (ν an integer)


We define r.s to be the class which contains rs. It is fairly obvious from
the definitions of the sum and product of residue classes that the distributive
law holds:


r.(s + t) = r.s + r.t


Also since 1.r = r, the residue class 1 is a neutral element for multiplication.
The residue classes, modulo n, thus comprise a set of n elements which may be
added, subtracted and multiplied like the integers 0, 1, 2, ..., (n − 1).

For integers, the equation rx = ry necessarily implies either that r = 0 or
that x = y, the latter following on division of each side of the equation by r,
i.e. cancelling r.

This cancellation law does not necessarily hold for residues. For example
(i) Modulo 11, we have


4.2 = 3.10 = 8
On cancelling by 2, we have
4.1 = 3.5 = 4


which is true
(ii) Modulo 12, we have


3.4 = 6.8 = 0
but
3.1 ≠ 6.2    and    1.1 ≠ 2.2


so that cancellation by 4 or 12 would not be valid.

The division of one residue by another, which the cancellation process 
involves, is not always possible. The conditions under which division may be 
performed and under which cancellation is valid will be considered in the next 
section. 


7.3 Division of Residues 


It is first necessary to define the meaning of s/r. To interpret this as the value 
of the quotient s/r would be unsatisfactory since s/r is not always an integer. 
A useful definition may be expressed in terms of the multiplication of residues. 


DEFINITION 7.2 
s/r is the value of x which satisfies the equation 


rx=s (7.1) 


provided the solution exists and is unique. 




Cases arise where the division s/r is not possible either because no value of 
x exists satisfying (7.1) or because more than one such value exists. For 
example, modulo 6, 


(i) 5.x = 4 has a unique solution x = 2. 
(ii) 3.x = 3 has three solutions x = 1, 3, 5. 
(iii) 4.x = 3 has no solution. 


Division is therefore defined to be possible in (i) only and 4/5 = 2. The 
following important theorem, concerning the division of residues, will now 
be proved. 


THEOREM 7.2 
For the residue classes modulo n, division by r is possible if and only if r, n
are coprime and r ≠ 0.

Consider separately the cases when r and n are and are not coprime.


(1) r, n coprime
By Euclid’s Algorithm


ri + nj = 1


where i, j are integers (see Appendix, eqn. A.2, p. 223).
For any integral value of s


ris + njs = s    ris ≡ s (n)    r.is = s


so that is is a solution of r.x = s for any value of s. It follows that if x takes
each of the n possible values 0, 1, 2, ..., n − 1, then r.x must take each of
the same n different values at least once and therefore once only.

Thus r.x = s has the unique solution is, as, for example, in case (i) above.
It follows that if r.x = r.y, then x = y provided r ≠ 0, so that the cancellation
law is valid.


(2) r, n not coprime 
Let h be the h.c.f. of r, n and let r = hα, n = hβ.

Then rx + kn is a multiple of h for any integral values of x and k.

Therefore r.x = t where t is a multiple of h.

It follows that r.x = s has no solution unless s is a multiple of h (as in
case (iii) above).

If, however, s is a multiple of h, a number of solutions may be found
(case (ii) above):

Let s = hy. Then by Euclid’s Algorithm


ri + nj = h    (see eqn. A.1, p. 223)
riy + njy = hy = s    riy ≡ s (n)


Therefore r.iy = s so that iy is one solution of r.x = s.


Now rβ = nα, so rβ ≡ 0 (n). Therefore r.β = 0. Therefore, remembering
hβ = n,


r.(iy + kβ) = s    [k = 0, 1, 2, ..., (h − 1)]


Thus iy + kβ, where k = 0, 1, 2, ..., (h − 1) and where β = n/h, are h
different solutions of r.x = s for any given s.

Therefore by definition, division is not possible when r, n are not coprime
and, in this case, the cancellation law is not valid since r.x = r.y does not
imply that x ≡ y (modulo n) though it does imply that x ≡ y (modulo
β = n/h) as also follows from Theorem 7.1.
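
For small moduli the statement of Theorem 7.2 can be confirmed by exhaustive search. The sketch below is illustrative; solve_residue_equation and division_possible are assumed names. It lists every residue x with r.x = s (mod n), so that the three cases modulo 6 discussed above, and the count of h solutions when s is a multiple of h, can be observed directly.

```python
from math import gcd

def solve_residue_equation(r, s, n):
    """All residues x in 0..n-1 with r.x = s (mod n), found by trial."""
    return [x for x in range(n) if (r * x - s) % n == 0]

def division_possible(r, n):
    """Division by r (mod n) is possible iff r is non-zero and coprime with n."""
    return r % n != 0 and gcd(r, n) == 1

unique = solve_residue_equation(5, 4, 6)   # case (i)
several = solve_residue_equation(3, 3, 6)  # case (ii)
none = solve_residue_equation(4, 3, 6)     # case (iii)
print(unique, several, none)
```

Brute force is adequate here because n is tiny; for larger moduli Euclid's Algorithm is the practical route.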

It follows from (1) above that where n = p and p is prime, if x takes the
values 1, 2, ..., (p − 1), then ax takes the same set of values in some order,
provided a is not a multiple of p. Therefore


a.2a.3a. ... .(p − 1)a ≡ 1.2.3. ... .(p − 1) (modulo p)

and, since 1.2.3. ... .(p − 1) is coprime with p and may therefore be cancelled,

a^(p−1) ≡ 1 (p)    provided a ≢ 0 (p)
a^p ≡ a (p)        provided a ≢ 0 (p)


But a^p ≡ a (p) even if a ≡ 0 (p), since both sides are then congruent to 0.
We have therefore proved that if p is prime, then a^p ≡ a (p), and that if a is
not a multiple of p, a^(p−1) ≡ 1 (p).
This is Fermat’s Theorem.
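
Fermat's Theorem is easy to spot-check by machine. The following Python sketch is an illustration, with function names chosen here rather than taken from the text; it relies on the built-in three-argument pow, which computes a power modulo p without forming the full power.

```python
def fermat_holds(a, p):
    """Check a^p ≡ a (mod p)."""
    return pow(a, p, p) == a % p

def fermat_reduced_holds(a, p):
    """Check a^(p-1) ≡ 1 (mod p) whenever a is not a multiple of p."""
    return a % p == 0 or pow(a, p - 1, p) == 1

primes = [5, 7, 11, 17]
all_ok = all(
    fermat_holds(a, p) and fermat_reduced_holds(a, p)
    for p in primes
    for a in range(0, 40)
)
print(all_ok)

# For a composite modulus the congruence may fail, e.g. a = 2, n = 4.
print(fermat_holds(2, 4))
```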


7.4 Arithmetic Prime Modulo 


In this case, all the residues* 1, 2, 3,..., (p — 1) are coprime with p; 
therefore division by each of these residues is possible. It follows that, in this 
arithmetic, the rules of addition, subtraction, multiplication and division are 
applicable to this set of p elements and that the set is closed under these 
operations. Cancellation is always valid in this arithmetic. 

For p = 7, the addition and multiplication tables are shown in Table 7.1.

In the multiplication table, it will be seen that every row and column
(except the first) contains each residue once and only once. This was to be
expected since rx = s has a unique solution for all p values of s for a given
value of r since p is prime. Inverses are indicated by the positions of 1 in the
table. They are seen to be 1 and 1, 2 and 4, 3 and 5 and 6 and 6.

* Residues will be printed in ordinary italic type from here onwards, not bold.

Addition                    Multiplication
+ | 0 1 2 3 4 5 6           × | 0 1 2 3 4 5 6
0 | 0 1 2 3 4 5 6           0 | 0 0 0 0 0 0 0
1 | 1 2 3 4 5 6 0           1 | 0 1 2 3 4 5 6
2 | 2 3 4 5 6 0 1           2 | 0 2 4 6 1 3 5
3 | 3 4 5 6 0 1 2           3 | 0 3 6 2 5 1 4
4 | 4 5 6 0 1 2 3           4 | 0 4 1 5 2 6 3
5 | 5 6 0 1 2 3 4           5 | 0 5 3 1 6 4 2
6 | 6 0 1 2 3 4 5           6 | 0 6 5 4 3 2 1

TABLE 7.1
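
Tables of this kind, and the inverses read off from them, can be generated rather than written out by hand. The sketch below is a minimal illustration in Python; multiplication_table and inverses are assumed names. An inverse of r is located, as in the text, by the position of 1 in row r of the multiplication table.

```python
def multiplication_table(n):
    """Row r, column s holds the residue of r*s modulo n."""
    return [[(r * s) % n for s in range(n)] for r in range(n)]

def inverses(n):
    """Map each non-zero residue that has an inverse to that inverse."""
    table = multiplication_table(n)
    return {r: table[r].index(1) for r in range(1, n) if 1 in table[r]}

table_7 = multiplication_table(7)
inverse_pairs_7 = inverses(7)
print(inverse_pairs_7)

# Every non-zero row of the prime-modulus table is a rearrangement of 0..6.
rows_are_permutations = all(sorted(row) == list(range(7)) for row in table_7[1:])
print(rows_are_permutations)
```

Running inverses on a composite modulus such as 6 returns entries only for the residues coprime with it.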


7.5 Arithmetic Nonprime Modulo 


In this case, n and one or more of the numbers 1, 2, 3, ..., (n − 1) will
not be coprime. Consequently, whilst the rules of addition, subtraction and
multiplication apply to all n elements, division is only possible by those
residues which are coprime with n.

For the composite modulo n = 6, the addition and multiplication tables
are shown in Table 7.2.

Since only 1 and 5 are coprime with 6, only the rows and columns corresponding
to 1 and 5 in the multiplication table will contain each of the
residues 0, 1, 2, ..., 5. Since 2, 3, 4 are not coprime with 6, division by these
residues is not possible.

The residue equation 2x = 4 has two solutions x = 2, 5 which is consistent
with the fact that 2 is the h.c.f. of 2 and 6 and 4 is a multiple of 2.

Addition                  Multiplication
+ | 0 1 2 3 4 5           × | 0 1 2 3 4 5
0 | 0 1 2 3 4 5           0 | 0 0 0 0 0 0
1 | 1 2 3 4 5 0           1 | 0 1 2 3 4 5
2 | 2 3 4 5 0 1           2 | 0 2 4 0 2 4
3 | 3 4 5 0 1 2           3 | 0 3 0 3 0 3
4 | 4 5 0 1 2 3           4 | 0 4 2 0 4 2
5 | 5 0 1 2 3 4           5 | 0 5 4 3 2 1

TABLE 7.2


Similarly 2x = 2 has two solutions x = 1, 4 whilst 2x = 1, 3, 5 have no
solutions since 1, 3, 5 are not multiples of 2.

The residue equation 3x = 3 has three solutions x = 1, 3, 5 which is
consistent with 3 being the h.c.f. of 3 and 6 and also a factor of 3.

On the other hand, 3x = 1, 2, 4, or 5 have no solutions since 3 is not a
factor of 1, 2, 4 or 5. Since 2 is the h.c.f. of 4 and 6, and 2 is a factor of 2
and 4, then 4x = 2 has two solutions x = 2, 5, whilst 4x = 4 has two solutions
x = 1, 4, differing by β = n/h = 3. Also 4x = 1, 3, 5 have no solutions.


EXAMPLE 7.3 
(A) Solve 17x ≡ 23 (29).


Solution Since 29 is prime, there is a unique solution. The solution may be
obtained by applying Euclid’s Algorithm thus (all congruences mod 29):


29 = 1.17 + 12                12x ≡ −17x ≡ −23 ≡ 6
17 = 1.12 + 5                 5x ≡ 17x − 12x ≡ −6 − 6 ≡ −12
12 = 2.5 + 2     so that      2x ≡ 12x − 2.5x ≡ 6 + 24 ≡ 30 ≡ 1
5 = 2.2 + 1                   x ≡ 5x − 2.2x ≡ −12 − 2 ≡ −14 ≡ 15
2 = 2.1 + 0


Solution x ≡ 15


(B) Solve 15x ≡ 6 (27).
Solution Here 15, 27 are not coprime. Their h.c.f. is 3 so that there will be 3
solutions differing by n/h = 27/3 = 9.


27 = 15 + 12                  12x ≡ −15x ≡ −6
15 = 12 + 3      so that      3x ≡ 15x − 12x ≡ 6 + 6 ≡ 12
12 = 4.3 + 0     (mod 27)     x ≡ 4 (mod 9)


The three solutions are x = 4, 13, 22.
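
The hand method used in (A) and (B) can be mirrored by a short routine. The Python sketch below is one possible arrangement, not the author's layout: extended_gcd and solve_linear_congruence are assumed names, and the routine tracks the multipliers i, j with ri + nj = h instead of reducing multiples of x line by line.

```python
def extended_gcd(a, b):
    """Return (h, lam, mu) with h = hcf(a, b) = lam*a + mu*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def solve_linear_congruence(r, s, n):
    """All x in 0..n-1 with r*x ≡ s (mod n), listed in increasing order."""
    h, i, _ = extended_gcd(r, n)
    if s % h != 0:
        return []                      # no solution unless h divides s
    beta = n // h                      # solutions differ by beta = n/h
    x0 = (i * (s // h)) % beta         # smallest non-negative solution
    return [x0 + k * beta for k in range(h)]

print(solve_linear_congruence(17, 23, 29))
print(solve_linear_congruence(15, 6, 27))
```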


(C) Solve 5x² − 2x ≡ 11 (13).


Solution Such equations may be solved by modifying the constant term and
the term in x until each term is divisible by the coefficient of x², i.e. 5.


5x² − 2x + 4.13x − 11 + 2.13 ≡ 0 (13)
5x² + 50x + 15 ≡ 0 (13)


Since 5, 13 are coprime,
x² + 10x + 3 ≡ 0 (13)
(x + 5)² ≡ 22 ≡ 9 (13)
x + 5 ≡ 3 or 10    x ≡ 11 or 5


(D) Show that an integer is divisible by 11 if the sum of its digits, taken with
alternate signs, is divisible by 11.


Solution
10 ≡ −1 (11)     10² ≡ 1 (11)
10³ ≡ −1 (11)    10⁴ ≡ 1 (11)    and so on
Any integer may be expressed in decimal form thus
n = a_0 + 10a_1 + 10²a_2 + 10³a_3 + ... + 10^p a_p
Let s = a_0 − a_1 + a_2 − a_3 + ... + (−1)^p a_p. Then
n − s = 11a_1 + (10² − 1)a_2 + (10³ + 1)a_3 + (10⁴ − 1)a_4 + ...


Now (10² − 1), (10³ + 1), (10⁴ − 1), ... are all congruent to 0 (mod 11).
Therefore


n − s ≡ 0 (11)    and the rule follows
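
The rule can be checked against ordinary division for a range of integers. The sketch below is illustrative; alternating_digit_sum is an assumed name. Digits are numbered from the units digit a_0, exactly as in the solution above.

```python
def alternating_digit_sum(n):
    """Return s = a_0 - a_1 + a_2 - ... for the decimal digits of n."""
    digits = [int(ch) for ch in reversed(str(abs(n)))]  # a_0 first
    return sum(d if k % 2 == 0 else -d for k, d in enumerate(digits))

def divisible_by_11(n):
    """Apply the digit rule: n is divisible by 11 iff s is."""
    return alternating_digit_sum(n) % 11 == 0

print(alternating_digit_sum(918082), divisible_by_11(918082))
```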


(E) The smallest positive integer c for which a^c ≡ 1 (p), where p is prime,
must be a divisor of (p − 1). Illustrate by taking a = 4, p = 23.


Solution Divide (p − 1) by c so that p − 1 = kc + r where 0 ≤ r < c.


Then a^(p−1) = a^(kc).a^r ≡ 1 (p) by Fermat’s Theorem
Since a^(kc) ≡ 1 (p)


a^r ≡ 1 (p)    r = 0 (since r < c and c is the smallest such positive integer)
so that c is a divisor of (p − 1)
4³ = 64 ≡ −5 (23)    4⁴ = 256 ≡ 3 (23)    4⁸ ≡ 9 (23)
4¹¹ ≡ −45 ≡ 1 (23)    Thus 4¹¹ ≡ 1 (23) and 11 is a factor of (23 − 1).
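
The smallest exponent c can also be found by repeated multiplication. The following Python sketch is an illustration under the assumption that p is prime and a is not a multiple of p (otherwise the loop would never reach 1); the name multiplicative_order is chosen here and does not come from the text.

```python
def multiplicative_order(a, p):
    """Smallest positive c with a^c ≡ 1 (mod p).

    Assumes p is prime and p does not divide a.
    """
    c, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        c += 1
    return c

c = multiplicative_order(4, 23)
print(c, (23 - 1) % c == 0)
```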




EXERCISES 7.1 
1. Write down the residue classes (i) mod 5, (ii) mod 4. 


2. Write out the addition tables for residue classes (i) mod 4, (ii) mod 5. What are 
the neutral elements and does every element have an inverse in (i) and (ii)? 
[0, yes] 


3. Write out the multiplication table for residue classes mod 5, Is there a neutral 
element? 
Omitting the row and column of zeros, what is the neutral element in the 
table? What are the inverses of 1, 2, 3, 4? 
What are the solutions of the following residue equations, mod 5? 


(i) 4x = 3    (ii) x² = 1, 4    (iii) x² = 2, 3
[1; 1, 3, 2, 4; (i) 2 (ii) (1, 4), (2, 3) (iii) no solutions]

4. Write out the multiplication tables for residue classes (i) mod 4, (ii) mod 8. 
What are the essential differences between these tables and the corresponding 
table for mod 5? Account for them. 

5. What are the solutions of the following residue equations, mod 4? 

(i) 3x = 2    (ii) 2x = 2    (iii) 2x = 3    (iv) x² = 1
[(i) 2, (ii) 1, 3, (iii) no solution, (iv) 1, 3]

6. Solve the following congruence equations 
(i) 3x ≡ 11 (16) (ii) 11x ≡ 2 (19) (iii) 6x ≡ 13 (30) (iv) 10x ≡ 22 (31)
(v) 4x ≡ 4 (8) (vi) 24x ≡ 12 (44) (vii) 6x ≡ 12 (30) (viii) 14x ≡ 7 (21)
[(i) 9, (ii) 14, (iii) no solution, (iv) 27, (v) 1, 3, 5, 7, (vi) 6, 17, 28, 39, (vii) 2,
7, 12, 17, 22, 27, (viii) 2, 5, 8, 11, 14, 17, 20]

7. Solve the congruence equations
(i) 3x² + 4x ≡ 1 (17) (ii) 6x² − 5x ≡ 27 (37)

[(i) 4, (ii) 12, 32]


8. Show that an integer is divisible by 3 or 9 if the sum of its digits is divisible by 
3 or 9 respectively. 


9. Show that an integer is divisible by 7 if the expression
S = a_0 + 3a_1 + 2a_2 − a_3 − 3a_4 − 2a_5 + ...


is divisible by 7 where a_0, a_1, a_2, ... are the successive digits of the integer
expressed in decimal form.


10. Check Fermat’s Theorem for p = 5, 7, 11 and 17, taking a = 3, 4, 5, 8.


Appendix: Euclid’s Algorithm 


To find the h.c.f. of two integers a, b (a > b > 0). Let q_1 be the quotient and r_1
the remainder when a is divided by b, then


a = bq_1 + r_1    b > r_1 ≥ 0


Similarly, on dividing b by r_1, we have
b = r_1 q_2 + r_2    b > r_1 > r_2 ≥ 0


Now divide r_1 by r_2 and continue to divide successive pairs of remainders.
After a finite number of such divisions, say (n + 1), the remainder r_(n+1) will
be zero and b > r_1 > r_2 > ... > r_(n+1) = 0.

We then have the following list of equations


a = bq_1 + r_1
b = r_1 q_2 + r_2
r_1 = r_2 q_3 + r_3
...
r_(n−3) = r_(n−2) q_(n−1) + r_(n−1)
r_(n−2) = r_(n−1) q_n + r_n
r_(n−1) = r_n q_(n+1) + 0

From the last equation, r_n is a factor of r_(n−1). From the preceding equation,
it follows that r_n is a factor of r_(n−2). Using these equations successively from
the bottom upwards, r_n is seen to be a factor of r_(n−3), r_(n−4), ..., r_2, r_1 and
finally of b and a.

Furthermore, if k is any common factor of a and b, it follows from the
first equation that k is a factor of r_1 and therefore, from the second equation,
k is a factor of r_2. Working downwards through the list of equations, k is a
factor of r_n. Since r_n is the largest factor of r_n, r_n must be the h.c.f. of a and b.

It will now be shown that if h is the h.c.f. of a and b, there are integers
λ, μ such that

h = λa + μb

h = r_n = r_(n−2) − r_(n−1) q_n
  = r_(n−2) − q_n (r_(n−3) − r_(n−2) q_(n−1))
  = λ_1 r_(n−3) + μ_1 r_(n−2)    where λ_1, μ_1 are integers
  = λ_1 r_(n−3) + μ_1 (r_(n−4) − r_(n−3) q_(n−2))
  = λ_2 r_(n−4) + μ_2 r_(n−3)    and so on


Continuing this process, we have ultimately
h = λa + μb    where λ, μ are integers    (eqn. A.1)
It follows that, if a and b are coprime, so that h = 1


1 = λa + μb    (eqn. A.2)
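
The list of division equations can be produced step by step in code. The sketch below is a minimal illustration in Python; euclid_steps and hcf are assumed names, and the inputs are taken to satisfy a > b > 0 as in the text. Each tuple records one line of the list in the form (dividend, divisor, quotient, remainder), and the h.c.f. is the divisor of the final line, whose remainder is zero.

```python
def euclid_steps(a, b):
    """Return the division equations of Euclid's Algorithm for a > b > 0.

    Each entry is (dividend, divisor, quotient, remainder) with
    dividend = divisor * quotient + remainder.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r
    return steps

def hcf(a, b):
    """The h.c.f. is the last divisor, i.e. the last non-zero remainder r_n."""
    return euclid_steps(a, b)[-1][1]

for dividend, divisor, quotient, remainder in euclid_steps(29, 17):
    print(f"{dividend} = {quotient}.{divisor} + {remainder}")
print(hcf(27, 15))
```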


8 Groups


8.1 Binary Operations 


In Chapter 1 of Volume 1, we considered the combination of real numbers 
under the binary operations of addition, subtraction, multiplication and 
division. It is useful and fruitful to generalize the concept of a binary opera- 
tion by which two elements of any set are combined to form a third element 
in accordance with some prescribed rule. 


DEFINITION 8.1 

A binary operation, denoted by the symbol ∘, is a rule by which two elements
x, y of a set S are combined to form a third element z = x ∘ y. (In a generalized
sense, x ∘ y will often be called the “product” of x and y under the binary
operation ∘.)


(a) Closure
A closed binary operation ∘ in S is such that x ∘ y is defined and is in S for all
ordered pairs x, y.

(The word “ordered” is included because x ∘ y is not necessarily the same
as y ∘ x.)


(b) Commutative Binary Operations
A commutative binary operation ∘ in a set S is such that if x ∘ y is defined so
is y ∘ x and x ∘ y = y ∘ x.


(c) Associative Binary Operations 
If the operation ∘ is such that whenever one of the two elements (x ∘ y) ∘ z
and x ∘ (y ∘ z) is defined so is the other and


(x ∘ y) ∘ z = x ∘ (y ∘ z)


then ∘ is an associative binary operation.
For example, in the set of real numbers, addition and multiplication are
associative operations but subtraction and division are not.


(d) Neutral or Identity Elements 
Let e be an element of S such that, whenever x ∘ e is defined,


x ∘ e = e ∘ x = x


then e is called a neutral or identity element of S for the operation ∘.
For example, in the set of real numbers, 1 is the identity element for the
operation of multiplication and 0 for addition.


(e) Inverse Elements 
Let ∘ be a closed binary operation in S and let e be the corresponding identity
element. If x, x′ ∈ S are such that


x ∘ x′ = x′ ∘ x = e


then x′ is the inverse of x and x is the inverse of x′.

For example, in the set of real numbers, every real number x has an inverse
−x under addition and, with the exception of 0, has an inverse 1/x or x⁻¹
under multiplication.


8.2 The Nature of Groups 


Amongst the mathematical systems comprising sets, which are subject to 
only one binary operation, the most important are those known as groups. 
Since the laws applicable to groups are few and simple (see Definition 8.2 
below) many structures are groups. It frequently happens that several 
apparently unrelated systems have the same group structure. Consequently, 
a study of the group structure of one particular system will provide an 
analysis of the properties of each of the systems. 


DEFINITION 8.2 
A group is a set S, subject to a binary operation ∘, and satisfying the
conditions:

(i) Closure: for every x, y ∈ S,
x ∘ y ∈ S
(ii) Associativity: for every x, y, z ∈ S,
(x ∘ y) ∘ z = x ∘ (y ∘ z)


(iii) Identity or Neutral Element: there is a unique element e ∈ S such that,
for every x ∈ S,


x ∘ e = x = e ∘ x
(iv) Inverses: for every x ∈ S, there is an element x′ ∈ S such that
x ∘ x′ = e = x′ ∘ x


A group which obeys the commutative law for products so that for every
x, y ∈ S


x ∘ y = y ∘ x
is called an Abelian group.*


EXAMPLES 8.1 
(A) The set of integers is a group with respect to addition. The set of positive 
integers is not a group with respect to addition since it has no identity element 
nor is it a group under multiplication since it has no inverses. 

The set of positive integers and their reciprocals do not form a group under 
multiplication since products are not always in the set, e.g. 


5 x 4=8 which is not in the set 


(B) The set of complex numbers forms a group under addition and, if 0 is
excluded, under multiplication.


(C) The set of residues (mod 5), excluding 0, form a group under multiplica- 
tion. 


* N. H. Abel, the Norwegian mathematician (1802-29), was a pioneer in group 
theory. 


× | 1 2 3 4
1 | 1 2 3 4
2 | 2 4 1 3
3 | 3 1 4 2
4 | 4 3 2 1

TABLE 8.1


The multiplication Table 8.1 for the set of residues 1, 2, 3, 4 (mod 5) shows 
that they form a group under multiplication since 


(i) the product of every pair exists, is unique and is in the set; 


(ii) the associative law holds, e.g. (2.4).3 = 3.3 = 4 and 2.(4.3) =
2.2 = 4;


(iii) the neutral element is 1; 


(iv) every element has a unique inverse: the inverse of a is the unique 
solution of ax = 1. 


The group is Abelian. 


(D) The set of residues (mod 4) under addition.
The addition Table 8.2 for the set of residues 0, 1, 2, 3 (mod 4) shows that they
form a group (Abelian) under addition with a neutral element 0, the
inverse of any residue x being (4 − x).

+ | 0 1 2 3
0 | 0 1 2 3
1 | 1 2 3 0
2 | 2 3 0 1
3 | 3 0 1 2

TABLE 8.2
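
For a finite set the four conditions of Definition 8.2 can be tested exhaustively. The sketch below is an illustrative Python check, not a method given in the text; is_group is an assumed name and the operation is passed in as an ordinary two-argument function.

```python
from itertools import product

def is_group(elements, op):
    """Test conditions (i) to (iv) of Definition 8.2 on a finite set."""
    elements = list(elements)
    # (i) closure: every product lies in the set
    if any(op(x, y) not in elements for x, y in product(elements, repeat=2)):
        return False
    # (ii) associativity: checked for every triplet
    if any(op(op(x, y), z) != op(x, op(y, z))
           for x, y, z in product(elements, repeat=3)):
        return False
    # (iii) a unique identity element
    identities = [e for e in elements
                  if all(op(x, e) == x == op(e, x) for x in elements)]
    if len(identities) != 1:
        return False
    e = identities[0]
    # (iv) every element has an inverse
    return all(any(op(x, y) == e == op(y, x) for y in elements) for x in elements)

print(is_group(range(1, 5), lambda x, y: (x * y) % 5))  # residues 1..4, mod 5
print(is_group(range(4), lambda x, y: (x + y) % 4))     # residues 0..3, mod 4
print(is_group(range(1, 6), lambda x, y: (x * y) % 6))  # fails closure: 2.3 = 0
```

The check costs on the order of the cube of the number of elements because of the associativity test, which is acceptable for sets of this size.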


(E) Let the rectangular axes Ox, Oy be rotated about O through an angle
α (−π < α ≤ π) in the anticlockwise sense into the positions Ox′, Oy′
(Fig. 8.1).


Figure 8.1


Let r_α denote this rotation; it will be equivalent to the following
transformation of coordinates:


x′ = x cos α + y sin α    y′ = −x sin α + y cos α


Let r_α, r_β be a pair of transformations and let their product r_β r_α denote the
transformation resulting from a rotation through an angle α followed by a
rotation through an angle β. Thus, r_β r_α defines a closed binary operation in
the set of rotations about O. The set of all transformations r_α forms an Abelian
group, containing an infinite number of elements, for which the neutral
element is r_0 and the inverse of r_α is r_(−α).


(F) With reference to Chapter 9 of Volume 1 (Vectors and Matrices), the set
comprising the following eight simple transformations forms a non-Abelian
group under multiplication.


I = (  1   0 )    Identity
    (  0   1 )

A = (  1   0 )    Reflection in Ox
    (  0  −1 )

B = ( −1   0 )    Reflection in Oy
    (  0   1 )

C = (  0   1 )    Reflection in the line y = x
    (  1   0 )

D = (  0  −1 )    Reflection in the line y = −x
    ( −1   0 )

E = (  0  −1 )    Quarter turn
    (  1   0 )

F = ( −1   0 )    Half turn
    (  0  −1 )

G = (  0   1 )    Three-quarter turn
    ( −1   0 )

groups 229 


Under the operation of multiplication, any pair of the above matrices leads 
to another member of the set (closure). For example, a half turn followed by 
a reflection in Oy is equivalent to a reflection in Ox: 


    ( −1   0 ) ( −1   0 )   (  1   0 )
    (  0   1 ) (  0  −1 ) = (  0  −1 )        i.e. BF = A

It may easily be verified that FB = A also, so that BF = FB = A. In general,


products are however non-commutative, for example CB = G, whilst BC = E. 
Table 8.3 shows the results of multiplying pairs of elements of the set (element 


      | I  A  B  C  D  E  F  G
    --+-----------------------
    I | I  A  B  C  D  E  F  G
    A | A  I  F  G  E  D  B  C
    B | B  F  I  E  G  C  A  D
    C | C  E  G  I  F  A  D  B
    D | D  G  E  F  I  B  C  A
    E | E  C  D  B  A  F  G  I
    F | F  B  A  D  C  G  I  E
    G | G  D  C  A  B  I  E  F

TABLE 8.3


of row postmultiplied by element of column). It will be observed that each
of the eight elements occurs once only in each row and column of Table 8.3
and that there is an identity element I.

The operation of multiplication is associative, for example 


C(BD) = CG =B (CB)D =GD=B 


i.e. C(BD) = (CB)D. This will be found to be true for all triplets of elements.
Every element has an inverse which is identified by the position of I in the
multiplication table. Elements I, A, B, C, D and F are their own inverses,
which is otherwise obvious on geometrical grounds. Also EG = I = GE.

The set of eight transformations is thus a non-Abelian group. It will be
observed that the subset I, E, F, G also forms a group under multiplication,
the product table being as in Table 8.4. It is said to be a subgroup of the main
group. Subgroups will be considered further in Section 8.12.
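The closure claim and the sample products quoted above can be checked mechanically. The following is a minimal sketch, not part of the original text: it assumes the standard matrices for the named transformations acting on column vectors, with the quarter turn E taken anticlockwise, and the helper name `matmul` is purely illustrative.

```python
def matmul(X, Y):
    """Product XY of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Assumed entries: standard matrices for the eight named transformations.
M = {
    "I": ((1, 0), (0, 1)),
    "A": ((1, 0), (0, -1)),    # reflection in Ox
    "B": ((-1, 0), (0, 1)),    # reflection in Oy
    "C": ((0, 1), (1, 0)),     # reflection in y = x
    "D": ((0, -1), (-1, 0)),   # reflection in y = -x
    "E": ((0, -1), (1, 0)),    # quarter turn
    "F": ((-1, 0), (0, -1)),   # half turn
    "G": ((0, 1), (-1, 0)),    # three-quarter turn
}
name = {matrix: label for label, matrix in M.items()}

# table[row][col] = element of row postmultiplied by element of column.
# A KeyError here would mean a product left the set (closure failing).
table = {r: {c: name[matmul(M[r], M[c])] for c in M} for r in M}

for r in M:
    print(r, "|", " ".join(table[r][c] for c in M))
```

Each printed row lists the products of one element with I, A, ..., G in turn, so the output can be compared directly with Table 8.3.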




      | I  E  F  G
    --+-----------
    I | I  E  F  G
    E | E  F  G  I
    F | F  G  I  E
    G | G  I  E  F

TABLE 8.4


(G) Table 8.5 defines a binary operation © between four elements e, a, b, c 
comprising a set. It may easily be verified that e, a, b, c form a group. This 
group of four elements is known as the Vierergruppe. Such a set of abstract 
symbols, having a group structure, is called an abstract group which serves 
as a mathematical model from which general deductions may be made which 
are applicable to particular sets having this group structure in common. 


    ∘ | e  a  b  c
    --+-----------
    e | e  a  b  c
    a | a  e  c  b
    b | b  c  e  a
    c | c  b  a  e

TABLE 8.5


(H) With reference to Section 7.5, it is clear that residues (mod 6) excluding
0 do not form a group under multiplication since x = 2, 3 and 4 have no
inverses whilst 2.3 = 3.2 = 0 and 3.4 = 4.3 = 0 and 0 is not in the set.
We have seen that rx ≡ 1 (mod n) has a solution only when r and n are coprime.
Residues for a prime modulus will form a group under multiplication if 0 is
excluded.
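The four group requirements used in Examples (C) to (H) can be tested by brute force on any small finite set. The sketch below is illustrative only; the function names `is_group` and `mul_mod` are assumptions introduced here and do not appear in the text.

```python
from itertools import product

def is_group(elements, op):
    """Check closure, associativity, neutral element and inverses by brute force."""
    elements = list(elements)
    # Closure: every product must lie in the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (ab)c = a(bc) for every triple.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Neutral element: some e with ea = ae = a for all a.
    neutrals = [e for e in elements
                if all(op(e, a) == a == op(a, e) for a in elements)]
    if not neutrals:
        return False
    e = neutrals[0]
    # Inverses: every a needs some b with ab = ba = e.
    return all(any(op(a, b) == e == op(b, a) for b in elements)
               for a in elements)

def mul_mod(n):
    """Multiplication of residues modulo n."""
    return lambda a, b: (a * b) % n

print(is_group(range(1, 6), mul_mod(6)))   # residues mod 6 without 0
print(is_group(range(1, 7), mul_mod(7)))   # residues mod 7 without 0
```

The first call fails at the closure step (2.3 = 0 is not in the set), in line with Example (H); the second succeeds because 7 is prime.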


8.3 The Order of a Group 


Examples (C), (D), (F), (G) above concern groups each of which contains a 
finite number of elements; such groups are known as finite groups. A 
“product” table may be produced for any finite group. The number of 
distinct elements in a finite group is called the order of that group, e.g. the 
order of the Vierergruppe (G) is 4. 

Groups such as those in Examples (B), (E), which are not finite groups, 
are said to be of infinite order. 




8.4 Notation 


The following notation will be used throughout the remainder of this chapter: 
G will denote a general group and e its identity element. The product of 
elements x, y will be denoted by xy and the inverse of an element x by x^{-1}.


8.5 The Inverse of a Product 


To prove that (xy)^{-1} = y^{-1}x^{-1}. Since the associative law (xy)z = x(yz)
applies to the group,

    (y^{-1}x^{-1})(xy) = y^{-1}[x^{-1}(xy)] = y^{-1}[(x^{-1}x)y] = y^{-1}ey = y^{-1}y = e
and
    (xy)(y^{-1}x^{-1}) = x[y(y^{-1}x^{-1})] = x[(yy^{-1})x^{-1}] = xex^{-1} = xx^{-1} = e

It follows that y^{-1}x^{-1} is the inverse of xy.


8.6 The Index Laws 


We define x^2 = xx, x^3 = x^2 x = xxx, x^4 = x^3 x, ..., x^n = x^{n-1} x. Since the
associative law holds, it is fairly obvious that the usual index laws

    x^m x^n = x^{m+n}    and    (x^m)^n = x^{mn}

also hold for m, n positive integers.
Also, since by the associative law

    x^n (x^{-1})^n = e

it follows that (x^{-1})^n is the inverse of x^n, i.e. (x^n)^{-1} = (x^{-1})^n.

The above laws are readily extended to apply to cases where m, n are
positive or negative integers. As in elementary algebra, we define x^0 to be e
and x^{-n} to be (x^n)^{-1}, n > 0.


8.7 Isomorphism 


If, in Table 8.1 (p. 227) which is the multiplication table for residues (mod 5), 
the abstract symbols e, a, c, b are substituted for 1, 2, 3, 4 respectively, we 
obtain the abstract group table shown in Table 8.6. By interchanging the order 
of b and c, this table becomes as Table 8.7. Again, if in Table 8.2 (p. 227) 
which is the addition table for residues (mod 4), the symbols e, a, b, c are 




      | e  a  c  b
    --+-----------
    e | e  a  c  b
    a | a  b  e  c
    c | c  e  b  a
    b | b  c  a  e

TABLE 8.6

substituted for 0, 1, 2, 3 respectively, we obtain the abstract group Table 8.7 
once again. 


      | e  a  b  c
    --+-----------
    e | e  a  b  c
    a | a  b  c  e
    b | b  c  e  a
    c | c  e  a  b

TABLE 8.7


Hence, if the one-to-one correspondences in Table 8.8 are used, the group 
structures of the residues (mod 5) under multiplication and of the residues 


           × mod 5     + mod 4

    e  →      1     →     0
    a  →      2     →     1
    b  →      4     →     2
    c  →      3     →     3

TABLE 8.8


(mod 4) under addition are identical. Two such groups are said to be 
isomorphic. 


Infinite groups, as well as finite groups, exhibit isomorphism. For example, 
when both are under addition the group of integers (positive, negative and 
zero) is isomorphic to the group of even integers, with the following one-to- 
one correspondence 

    ..., −3, −2, −1, 0, 1, 2, 3, ...
    ..., −6, −4, −2, 0, 2, 4, 6, ...


The “products’’ of two elements from each group exhibit the same corre- 
spondence: 


(m + n) from the first group —> (2m + 2n) from the second 
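Whether a proposed one-to-one correspondence really preserves "products" can be confirmed by checking every pair of elements. The sketch below does this for the correspondence of Table 8.8; the function name `is_isomorphism` and the second, deliberately naive mapping are assumptions introduced only for illustration.

```python
def is_isomorphism(phi, op_src, op_dst, target):
    """True if phi is one-to-one onto target and phi(ab) = phi(a)phi(b) for all a, b."""
    one_to_one = sorted(phi.values()) == sorted(target)
    preserves = all(
        phi[op_src(a, b)] == op_dst(phi[a], phi[b]) for a in phi for b in phi
    )
    return one_to_one and preserves

def times_mod5(a, b):
    return (a * b) % 5

def plus_mod4(a, b):
    return (a + b) % 4

table_8_8 = {1: 0, 2: 1, 4: 2, 3: 3}   # the correspondence of Table 8.8
naive = {1: 0, 2: 1, 3: 2, 4: 3}       # hypothetical mapping "in numerical order"

print(is_isomorphism(table_8_8, times_mod5, plus_mod4, range(4)))
print(is_isomorphism(naive, times_mod5, plus_mod4, range(4)))
```

The naive mapping is one-to-one but does not preserve products (2.2 = 4 would have to correspond to 1 + 1 = 2), which shows that a one-to-one correspondence alone is not enough for isomorphism.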




EXERCISES 8.1 


1. State which of the following sets are groups for the specified binary operations.
For those which are groups, give the identity element and some examples of the
inverses of the other elements. Give reasons why the remaining sets are not groups.


(a) Sets of numbers 
(i) Positive integers (excluding 0) under addition. 
(ii) Even integers (positive, negative and zero) under addition. 
(iii) All integers (positive, negative and zero) under multiplication. 
(iv) Odd integers under addition. 
(v) Real numbers under multiplication. 
(vi) Real numbers under division. 
(vii) Rational numbers under multiplication. 
(viii) Rational numbers, excluding 0, under multiplication. 
(ix) All integers (positive, negative and zero) divisible by 3 under addition. 
(x) All numbers of the form 3^n (with n a positive or negative integer or zero)
under multiplication.


(b) Sets of residues 

(xi) Residues, mod 4, under multiplication. 

(xii) Residues (0, 2, 4, 6), mod 7, under addition. 
(xiii) Residues excluding 0, mod 7, under multiplication. 

(xiv) Residues, mod 8, under addition. 

(xv) Residues excluding 0, mod 8, under multiplication. 

(xvi) Residues (1, 2, 4, 5, 7, 8), mod 9, under multiplication. 
(xvii) Residues (1, 3, 4, 5, 9), mod 11, under multiplication. 
[Nos. i, iii, iv, vi, xi, xii and xv are not groups for the reasons given below. 

(i) No neutral element. (iii) No inverses. (iv) No identity element (unless 

0 included). Closure requirements unsatisfied. (vi) Associativity requirements 
unsatisfied. (xi) No inverses for 0 and 2. (xii) and (xv) Closure and inverse 
requirements unsatisfied.] 


2. Show that the set of residues excluding 0, mod 12, does not form a group under
multiplication. Show further that the subset (1, 3, 5, 7, 9, 11) is not a group but that
the subset (1, 5, 7, 11) is a group. What are the inverses of the elements of
(1, 5, 7, 11)? [Inverses are (1, 5, 7, 11)]


3. Complete the table on p. 234 for multiplication modulo 8, where the entry in the
rth row and sth column is the remainder on dividing the ordinary arithmetical
product rs by 8.

Find the largest subset of the numbers 1, 2, 3, ..., 7 which forms a group
under this rule of combination.

Prove that, if a, b, x are elements of the subset just obtained, and if ax ≡
b (mod 8), then

    x ≡ ab (mod 8)
(Oxford and Cambridge G.C.E. A-level, S.M.P.)

[Table for No. 3: multiplication modulo 8, rows and columns headed 1, 2, 3, ..., 7, to be completed.]


4. Give reasons why the following sets are groups: 
(i) Vectors of the form (kx, x), where k is a scalar constant, under vector 
addition. 
(ii) Non-singular (2 x 2) matrices under multiplication. 


8.8 Permutation Groups 


Let n different objects be labelled 1, 2, 3, ..., n. These objects may be
arranged amongst themselves in n! different ways, i.e. there are n! permutations.
Each of these permutations of n objects may be represented by a
particular order of the numbers 1, 2, 3, ..., n, and is defined to be a
permutation of degree n.

A permutation of degree n may conveniently be written in the form 


    ( 1    2    3    ...  i    ...  n   )
    ( a_1  a_2  a_3  ...  a_i  ...  a_n )

which indicates that the numbers 1, 2, 3, ..., n are replaced by a different
order of themselves, namely by a_1, a_2, a_3, ..., a_i, ..., a_n respectively, i.e. i
is replaced by a_i for all values of i. For example

    ( 1  2  3  4 )
    ( 3  1  4  2 )

represents the permutation of 1, 2, 3, 4 changing 1 into 3, 2 into 1, 3 into 4
and 4 into 2.
The 6 possible permutations of 3 objects are 


    ( 1 2 3 )   ( 1 2 3 )   ( 1 2 3 )   ( 1 2 3 )   ( 1 2 3 )   ( 1 2 3 )
    ( 1 2 3 )   ( 1 3 2 )   ( 2 1 3 )   ( 2 3 1 )   ( 3 1 2 )   ( 3 2 1 )




Note that the order of the numbers in the upper row is not significant since 
we are only concerned with the changes which take place so that, for example 


    ( 1 2 3 )   ( 2 1 3 )   ( 3 2 1 )
    ( 2 3 1 )   ( 3 2 1 )   ( 1 3 2 )


all represent the same permutation. It is convenient to denote the permutation

    ( 1    2    3    ...  i    ...  n   )            ( i   )
    ( a_1  a_2  a_3  ...  a_i  ...  a_n )     by     ( a_i )

where i takes the values 1, 2, 3, ..., n successively.


PRODUCT OF TWO PERMUTATIONS OF THE SAME 
n OBJECTS 


Let
    x = ( i   )     and     y = ( i   )
        ( a_i )                 ( b_i )

The product xy is defined to be the permutation which results when
1, 2, 3, ..., n are first rearranged in accordance with the permutation y


and the resulting integers are again rearranged in accordance with the 
permutation x. For example, if 


1.2) 34 rez 3 4 
x= and y= 
4.3) 2 2 ae | 


- wow fF} W 
a as 
_— 


Nu NY NY NY 


3 

eae. 
[Wastae 
oe 
me ay 4 


* For convenience, the permutation x has been rewritten with the order of its first 
row identical to the order of the second row of y. 




Note that 


oe 
oo a oe 




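so that xy ≠ yx. Products are not, in general, commutative. In the general
case, we may write

    xy = ( b_i     ) ( i   ) = ( i       )
         ( a_{b_i} ) ( b_i )   ( a_{b_i} )

whilst

    yx = ( a_i     ) ( i   ) = ( i       )
         ( b_{a_i} ) ( a_i )   ( b_{a_i} )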

SYMMETRIC GROUP OF DEGREE n 
It may be shown that the set of all permutations

    ( i   )
    ( a_i )      (i = 1, 2, 3, ..., n),

containing n! elements, forms a group under multiplication.


The product of any two permutations x, y is unique and is in the set. It may 


readily be verified that the associative law holds for products. 


The identity element is

    ( i )
    ( i ) = e

For example,

    ex = ( a_i ) ( i   ) = ( i   ) = x
         ( a_i ) ( a_i )   ( a_i )




Every element x has an inverse. The inverse of

    ( i   )        ( a_i )
    ( a_i )   is   ( i   )

and we have

    x x^{-1} = ( i   ) ( a_i ) = ( a_i ) = e
               ( a_i ) ( i   )   ( a_i )

    x^{-1} x = ( a_i ) ( i   ) = ( i ) = e
               ( i   ) ( a_i )   ( i )


It follows that this set of permutations is a group of order n! under
multiplication, which is non-Abelian provided n ≥ 3. It is called the symmetric
group of degree n and will be denoted by S_n.
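The product rule (y applied first, then x) and the group properties of the set of all permutations can be tried out for n = 3. In this sketch a permutation is stored as a tuple whose i-th entry is the number replacing i; that representation, the function names and the two sample permutations x and y are assumptions made for illustration and are not the worked example of the text.

```python
from itertools import permutations

def compose(x, y):
    """Product xy: rearrange by y first, then by x."""
    return tuple(x[y[i] - 1] for i in range(len(y)))

def inverse(x):
    """Inverse permutation: if x replaces i by a_i, the inverse replaces a_i by i."""
    inv = [0] * len(x)
    for i, image in enumerate(x, start=1):
        inv[image - 1] = i
    return tuple(inv)

S3 = list(permutations((1, 2, 3)))   # all 3! = 6 permutations of degree 3
e = (1, 2, 3)                        # identity: every i is replaced by itself

x = (2, 1, 3)   # hypothetical sample: 1 -> 2, 2 -> 1, 3 -> 3
y = (1, 3, 2)   # hypothetical sample: 1 -> 1, 2 -> 3, 3 -> 2

print("xy =", compose(x, y))
print("yx =", compose(y, x))
print("closed:", all(compose(a, b) in S3 for a in S3 for b in S3))
print("inverses:", all(compose(a, inverse(a)) == e for a in S3))
```

Since xy and yx come out different for these two elements, the group of degree 3 is already non-Abelian.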


8.9 The Order of an Element of a Group 


In the group Table 8.4 (p. 230) for the group (I, E, F, G), each element
of which is a (2 x 2) matrix, let us find the powers of the various elements.
All powers of the identity element I are equal to I.

It is easily verified that

    E^1 = E    E^2 = F    E^3 = E^2 E = FE = G    E^4 = E^3 E = GE = I
    E^5 = E^4 E = IE = E    E^6 = I E^2 = F

The first four powers of E are E, F, G, I, the four elements of the group and,
since E^4 = I, higher powers of E must necessarily repeat the values E, F, G, I.
Similarly, we find

    F^1 = F    F^2 = I    F^3 = IF = F    F^4 = I^2 = I    and so on
    G^1 = G    G^2 = F    G^3 = GF = FG = E
    G^4 = GE = EG = I    G^5 = IG = G    and so on


In the general case, let x be any element of a group. Then, by the closure
property, the "product" xx, i.e. x^2, is an element of the group (here "product"
is used, in a general sense, to denote the result of applying the particular
binary operation of the group to a pair of elements). In fact, x, x^2, x^3, ...,
are all elements of the group.

In an infinite group, it is possible that all these elements may be different.
However, there are, at most, n different elements in a group of order n so that
ultimately two of these powers of x must be identical.

    Thus for some p, q (q > p),   x^q = x^p
    Therefore x^{q-p} = e,   i.e. x^k = e   (k > 0)




Thus, in a finite group, there are positive values of k such that x^k = e. The
smallest value of k for which x^k = e is called the order (or period) of the
element x.

It follows that, in the example at the beginning of this section, E, G are
elements of order 4 (the order of the group), F is of order 2, and the identity
element I is of order 1, as it is in any group.


EXAMPLE 8.2 
(A) Consider the residues (mod 12), excluding 0, under multiplication. Only 
the residues which are coprime with 12, i.e. 1, 5, 7, 11, form a group (see 
Exercises 8.1, No. 2). 

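The group table for these residues is given in Table 8.9.

    ×  |  1   5   7  11
    ---+---------------
    1  |  1   5   7  11
    5  |  5   1  11   7
    7  |  7  11   1   5
    11 | 11   7   5   1

TABLE 8.9

The powers of the various elements are

    Element    Powers         Order
    1                         1
    5          5^2 = 1        2
    7          7^2 = 1        2
    11         11^2 = 1       2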

(B) Consider the residues (mod 7), excluding 0, under multiplication. Since 
7 is prime, they form a group for which the multiplication table is as shown 
in Table 8.10. 


    × | 1  2  3  4  5  6
    --+-----------------
    1 | 1  2  3  4  5  6
    2 | 2  4  6  1  3  5
    3 | 3  6  2  5  1  4
    4 | 4  1  5  2  6  3
    5 | 5  3  1  6  4  2
    6 | 6  5  4  3  2  1


TABLE 8.10 




The powers of the various elements are 


    Element    Powers                                         Order

    1                                                         1
    2          2^2 = 4, 2^3 = 2.4 = 1                         3
    3          3^2 = 2, 3^3 = 3.2 = 6, 3^4 = 3.6 = 4,
               3^5 = 3.4 = 5, 3^6 = 3.5 = 1                   6
    4          4^2 = 2, 4^3 = 4.2 = 1                         3
    5          5^2 = 4, 5^3 = 5.4 = 6, 5^4 = 5.6 = 2,
               5^5 = 5.2 = 3, 5^6 = 5.3 = 1                   6
    6          6^2 = 1                                        2
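The orders listed in Example 8.2 can be obtained by forming successive powers until the identity appears. The sketch below follows that procedure directly; the function name `order` is an assumption, and the loop is only guaranteed to stop when x belongs to a finite group.

```python
def order(x, op, e):
    """Smallest k > 0 with x^k = e, found by forming x, x^2, x^3, ... in turn."""
    k, power = 1, x
    while power != e:
        power = op(power, x)
        k += 1
    return k

def mul_mod(n):
    """Multiplication of residues modulo n."""
    return lambda a, b: (a * b) % n

orders_mod12 = {x: order(x, mul_mod(12), 1) for x in (1, 5, 7, 11)}
orders_mod7 = {x: order(x, mul_mod(7), 1) for x in range(1, 7)}

print(orders_mod12)
print(orders_mod7)
```

Elements whose order equals the order of the group (3 and 5 in the second case) are exactly those whose powers run through every element.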


8.10 Cyclic Groups 


THEOREM 8.1
If the element x of a group has order r, then e, x, x^2, ..., x^{r-1} are r distinct
elements of the group.


Suppose they are not all distinct, and let x^m = x^n (r − 1 ≥ m > n ≥ 0).
Therefore

    x^{m-n} = e,   i.e. x^t = e where t = m − n > 0   (0 < t ≤ r − 1)

which is contrary to the hypothesis that the order of x is r. Thus the first r
powers of x are distinct.

Also x^{r+1} = x, x^{r+2} = x^2, x^{r+3} = x^3, ..., so that the elements are repeated
indefinitely for higher powers of x.

    x^p = e implies that p is a multiple of r

Since a group of order n contains n distinct elements, no element can have
an order greater than n. An element of order n is such that x^n = e, and e,
x, x^2, ..., x^{n-1} are n distinct elements of the group and therefore comprise
the whole group. Thus x generates the whole group. Such a group is said to be
cyclic and x is a generator of the group. A group may have several generators.
A cyclic group of order n will be denoted by C_n.

With reference to Example 8.2(B), it is seen that the residues 3, 5 are 
generators of the group of residues (mod 7), excluding 0, under multiplication 
and that they are the only generators. 

If x is an element of order r (< n) of a group of order n, then e, x,
x^2, ..., x^{r-1} is a (cyclic) subgroup of the main group. For example, in the
group of residues (mod 7) (Example 8.2(B)), the residue 2 is of order 3 since
2^3 ≡ 1 (mod 7). We have 2^1 = 2, 2^2 = 4, 2^3 = 1 (mod 7) so that the cyclic




group generated by 2 is 1, 2, 4. This is a subgroup of the whole group of 
residues (mod 7) and has the multiplication table shown in Table 8.11. 


    × | 1  2  4
    --+--------
    1 | 1  2  4
    2 | 2  4  1
    4 | 4  1  2

TABLE 8.11 

THEOREM 8.2 


If x is a generator of a cyclic group of order n, then x^r is of order n provided
r is prime to n.


Since x is a generator, then x^{λn} = e, where λ is an integer.

Let (x^r)^s = e, then rs = λn (a multiple of n).

Since r is prime to n, s must be a multiple of n, and the smallest value of s
such that (x^r)^s = e is s = n.

Therefore (x^r)^n = e and x^r is a generator of the group.


THEOREM 8.3
If r is not prime to n and l is the l.c.m. of r and n, then x^r is of order l/r.


The smallest value of s such that rs is a multiple of n is given by rs = l, i.e.
s = l/r. Therefore the order of x^r is l/r.


These theorems may be exemplified from the cyclic group of order 6:

    C_6:  e, x, x^2, x^3, x^4, x^5   (x^6 = e)

r = 1, 5 (only) are prime to 6. For r = 1, we have x^r = x^1 and since x^6 = e,
x is of order 6.

x^5 is of order 6 since the successive powers of x^5 are x^5, x^10 = x^4, x^15 = x^3,
x^20 = x^2, x^25 = x, x^30 = e.

r = 2, 3, 4 are not prime to 6.

    r    Powers of x^r                   Order of x^r    l     l/r
    2    x^2, x^4, x^6 = e               3               6     3
    3    x^3, x^6 = e                    2               6     2
    4    x^4, x^8 = x^2, x^12 = e        3               12    3




As an illustration of a cyclic group of order 6, we find that 3 is a generator
of the cyclic group of residues (mod 7), excluding 0, under multiplication.
It may readily be verified that 3^5 is also of order 6, whilst 3^2 and 3^4 are
of order 3 and 3^3 is of order 2.
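Theorems 8.2 and 8.3 can be compared with a direct count for n = 6. The sketch below models the powers of a generator x by their exponents added modulo n, which is an assumption of convenience rather than the notation of the text; the function name `order_of_power` is likewise illustrative.

```python
from math import gcd

def order_of_power(r, n):
    """Order of x^r in a cyclic group of order n generated by x.

    Powers of x^r have exponents r, 2r, 3r, ... reduced mod n;
    the order is the number of steps needed to reach exponent 0 (i.e. e).
    """
    s, exponent = 1, r % n
    while exponent != 0:
        exponent = (exponent + r) % n
        s += 1
    return s

n = 6
for r in range(1, n):
    l = r * n // gcd(r, n)          # l.c.m. of r and n
    print(r, order_of_power(r, n), l // r)
```

In each printed row the counted order agrees with l/r; when r is prime to n the l.c.m. is rn, so l/r = n and Theorem 8.2 appears as a special case of the same formula.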


INFINITE CYCLIC GROUPS 


An infinite group G is said to be cyclic if there is an element x belonging to G
such that G consists of

    ..., x^{-3}, x^{-2}, x^{-1}, e, x, x^2, x^3, ...

i.e. of all the positive and negative integral powers of x together with the
neutral element e. The inverse of x^r is x^{-r}. x is called the generator of the
group.


8.11 Transformation Groups 


If a rectangle is rotated through 180°: (a) in its own plane about its centre 
in either the clockwise or the anticlockwise sense or (b) about a line through 
its centre parallel to one pair of its sides, the rectangle will occupy the same 
position though its vertices will be in different positions. Such a movement 
of the figure is called a geometrical transformation or simply a transformation. 

Many examples of figures with a high degree of symmetry occur for which 
the set of all such transformations forms a group. 


1. THE SYMMETRY GROUP OF THE RECTANGLE 


Let r denote the transformation resulting from a clockwise rotation of 180° 
about O, the centre of the rectangle (Fig. 8.2), and let p denote the trans- 
formation resulting from a rotation of 180° about Oy || BC. The product pr 


Figure 8.2




will denote the transformation corresponding to a transformation r followed 
by a transformation p. 

If r is applied twice in succession, the rectangle will be rotated through
2 × 180° in the clockwise sense, i.e. the transformation r^2 restores the
rectangle to its original position. If e denotes the neutral element, i.e. e is the
transformation which leaves the rectangle quite unchanged, we have r^2 = e.
Similarly p^2 = e.

At first sight, it might appear that an infinite number of transformations
are possible but this is not the case since, for example, r^3 represents a rotation
through 3 × 180° which has the same effect as a rotation through 180°, i.e.
r^3 = r and r^4 = e, r^5 = r, and so on.

r^{-1}, the inverse of r, denotes the transformation resulting from an
anticlockwise rotation of 180°.

From Fig. 8.2, it is obvious that pr = rp. 

The set of four elements (e,r,p, pr) form a group since their product 
table (Table 8.12) shows that the set is closed under multiplication, each 


    ×  | e   r   p   pr
    ---+---------------
    e  | e   r   p   pr
    r  | r   e   pr  p
    p  | p   pr  e   r
    pr | pr  p   r   e

TABLE 8.12


element has an inverse and, as may readily be verified, the associative law
holds. This group is an example of the Vierergruppe (Table 8.5, p. 230) with
the correspondence a = r, b = p, c = pr. It is also called the dihedral group
of order 4 ("dihedral" refers to symmetries involving both sides of the
rectangle). This group contains three subgroups of order 2 as shown in Table
8.13.


    × | e  r        × | e  p        ×  | e   pr
    --+-----        --+-----        ---+-------
    e | e  r        e | e  p        e  | e   pr
    r | r  e        p | p  e        pr | pr  e

TABLE 8.13


2. THE SYMMETRY GROUP OF THE EQUILATERAL 
TRIANGLE 


Let the transformations resulting from (i) a rotation of 120° in a clockwise 
sense about O, the centre of the triangle, and (ii) a half-turn about an altitude 
AN (Fig. 8.3) be denoted by r and p respectively. 




Figure 8.3


r^2 represents a rotation through 240° and r^3 a rotation through 360°.

Therefore r^3 = e.

The set (e, r, r^2) is a cyclic (rotation) group of order 3.

As in 1 (above), p^2 = e.

If any one of the transformations e, r, r^2 is applied and then the triangle
is turned about the altitude AN, a new transformation is obtained. Thus,
there is a set of six transformations:

    (e, r, r^2, p, pr, pr^2)

It is easily verified from Fig. 8.3 that

    pr^2 = rp   and   pr = r^2 p
The product table (Table 8.14) shows that the above set of six transformations 


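    ×    | e     r     r^2   p     pr    pr^2
    -----+-----------------------------------
    e    | e     r     r^2   p     pr    pr^2
    r    | r     r^2   e     pr^2  p     pr
    r^2  | r^2   e     r     pr    pr^2  p
    p    | p     pr    pr^2  e     r     r^2
    pr   | pr    pr^2  p     r^2   e     r
    pr^2 | pr^2  p     pr    r     r^2   e

TABLE 8.14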


form a group. It is, in fact, the smallest non-Abelian group, and is called the 
dihedral group of order 6. 




(Note that many products in Table 8.14 may be simplified by using the
relationships pr^2 = rp, pr = r^2 p above, e.g. pr^2 . pr^2 = p . pr . r^2 = p^2 r^3 = e.)
This group contains a number of subgroups. Apart from the trivial subgroup
e, there is one subgroup of order 3, a cyclic (rotation) group, whose
product table is at the top left-hand corner of Table 8.14. There are also
three subgroups of order 2 as in Table 8.15. It may be shown that the


    × | e  p        ×  | e   pr        ×    | e     pr^2
    --+-----        ---+-------        -----+-----------
    e | e  p        e  | e   pr        e    | e     pr^2
    p | p  e        pr | pr  e         pr^2 | pr^2  e

TABLE 8.15


symmetric group of degree 3 (see Section 8.8) is isomorphic to the symmetry
group of the equilateral triangle. The symmetric group of degree 3 is of order
3! = 6.

Any transformation of the equilateral triangle may be represented by a 
permutation of degree 3. For example, consider the transformation r (Fig. 
8.4). Let 1, 2, 3 denote the positions of the vertices of the triangle. 


Figure 8.4
The transformation r moves A from 1 to 3, B from 2 to 1, C from 3 to 2,
so that the transformation r corresponds to a permutation which will be
denoted by r:

    r = ( 1 2 3 )
        ( 3 1 2 )

For similar reasons, we write

    r^2 = ( 1 2 3 )        p = ( 1 2 3 )
          ( 2 3 1 )            ( 1 3 2 )

It may easily be verified from Fig. 8.3 that the permutation r^2 above has the
same geometrical interpretation as the application of the transformation r
twice in succession. Again (r is applied first and then p),

    pr = ( 1 2 3 ) ( 1 2 3 ) = ( 1 2 3 )
         ( 1 3 2 ) ( 3 1 2 )   ( 2 1 3 )
            (p)       (r)

It may be verified directly, in this manner, that pr^2 = rp and that r^2 p = pr.

    pr^2 = ( 1 2 3 ) ( 1 2 3 ) = ( 1 2 3 ) = rp
           ( 1 3 2 ) ( 2 3 1 )   ( 3 2 1 )
              (p)      (r^2)

    r^2 p = ( 1 2 3 ) ( 1 2 3 ) = ( 1 2 3 ) = pr
            ( 2 3 1 ) ( 1 3 2 )   ( 2 1 3 )
              (r^2)      (p)


It may also be easily verified that r^3 = p^2 = e and that the product of any
pair of transformations corresponds to the product of their permutations.
Thus there is one-to-one correspondence between the group G of six
geometrical transformations (e, r, r^2, p, pr, pr^2) and the six permutations of the
symmetric group S_3, i.e. the groups G and S_3 are isomorphic.
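The relations between r and p can be checked on the permutations themselves. In this sketch r is the permutation described in the text (1 into 3, 2 into 1, 3 into 2); the choice of p as the permutation fixing position 1 and interchanging 2 and 3 is an assumption based on the half-turn being about the altitude through the vertex at position 1. The tuple representation and the name `compose` are illustrative.

```python
from itertools import permutations

def compose(x, y):
    """Product xy: y is applied first, then x."""
    return tuple(x[y[i] - 1] for i in range(len(y)))

e = (1, 2, 3)
r = (3, 1, 2)   # 1 -> 3, 2 -> 1, 3 -> 2, as described in the text
p = (1, 3, 2)   # assumed: position 1 fixed, positions 2 and 3 interchanged

r2 = compose(r, r)
pr = compose(p, r)      # r first, then p
pr2 = compose(p, r2)
G = [e, r, r2, p, pr, pr2]

print("r^3 = e :", compose(r2, r) == e)
print("p^2 = e :", compose(p, p) == e)
print("pr^2 = rp :", pr2 == compose(r, p))
print("r^2 p = pr :", compose(r2, p) == pr)
print("all of S_3 :", sorted(G) == sorted(permutations((1, 2, 3))))
```

The last line confirms that the six transformations account for all six permutations of degree 3, which is the one-to-one correspondence the isomorphism requires.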

Alternatively, we may say that if the binary operation in a set of six 
elements (e, a, b, c, d, f) is defined by Table 8.16 we obtain a group G which 


Sao ¢ ad of 


Che a, pene 
a apres} 


fife Ze 
TABLE 8.16 


© 


is identical with the symmetric group S_3 except for the nature of their
elements.




3. THE SYMMETRY GROUP OF THE SQUARE 


Since the square has a higher degree of symmetry than the rectangle, it is to 
be expected that this group has a higher order than that of the rectangle. The 
order of the symmetry group of the square is 8. 

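Let r denote the transformation resulting from a clockwise rotation of the
square about its centre and in its own plane through 90°. Let p denote a
half-turn of the square about a line through its centre parallel to a side (or
about one of its diagonals). The set of eight transformations (e, r, r^2, r^3, p,
pr, pr^2, pr^3), with

    r^4 = e    p^2 = e    rp = pr^3    r^2 p = pr^2    r^3 p = pr

form a group with product table as in Table 8.17.

    ×    | e     r     r^2   r^3   p     pr    pr^2  pr^3
    -----+-----------------------------------------------
    e    | e     r     r^2   r^3   p     pr    pr^2  pr^3
    r    | r     r^2   r^3   e     pr^3  p     pr    pr^2
    r^2  | r^2   r^3   e     r     pr^2  pr^3  p     pr
    r^3  | r^3   e     r     r^2   pr    pr^2  pr^3  p
    p    | p     pr    pr^2  pr^3  e     r     r^2   r^3
    pr   | pr    pr^2  pr^3  p     r^3   e     r     r^2
    pr^2 | pr^2  pr^3  p     pr    r^2   r^3   e     r
    pr^3 | pr^3  p     pr    pr^2  r     r^2   r^3   e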


TABLE 8.17 


There is a cyclic (rotational) subgroup of order 4 whose product table is at 
the top left-hand corner of Table 8.17. 
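A product table of this kind can be generated from the defining relations alone. The sketch below writes every element in the form p^a r^b and reduces products using r^n = e, p^2 = e and r^t p = p r^(n-t); the pair encoding (a, b), the function names and the convention that the row element stands on the left are assumptions made for illustration.

```python
def dihedral_product(x, y, n):
    """(p^a r^b)(p^c r^d) in normal form, using r^n = e, p^2 = e, r^t p = p r^(n-t)."""
    (a, b), (c, d) = x, y
    if c == 0:
        return (a, (b + d) % n)            # p^a r^b r^d
    return ((a + 1) % 2, (d - b) % n)      # r^b p = p r^(-b), then combine

def label(x):
    """Readable name such as 'e', 'r^2' or 'pr^3' for the pair (a, b)."""
    a, b = x
    if (a, b) == (0, 0):
        return "e"
    rot = "" if b == 0 else ("r" if b == 1 else f"r^{b}")
    return ("p" if a else "") + rot

n = 4   # the square; n = 3 gives the equilateral triangle
elements = [(a, b) for a in (0, 1) for b in range(n)]

for x in elements:
    row = [label(dihedral_product(x, y, n)) for y in elements]
    print(f"{label(x):5}|", " ".join(f"{s:5}" for s in row))
```

Changing n to 3 produces the six-element table of the equilateral triangle from the same two lines of reduction.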


EXERCISES 8.2 
1. Show that the residues, mod 5 and mod 8, form cyclic groups under addition. 
Find the orders of all their elements and identify their generators. 
[Mod 5: 1, 2, 3 and 4 are generators; mod 8: 1, 3, 5 and 7 are generators; 
2 (order 4), 4 (order 2), 6 (order 4)] 


2. Identify the generators of the cyclic groups
(a) (e, x, x^2, x^3, x^4)    (b) (e, y, y^2, y^3, y^4, y^5)
[(a) x, x^2, x^3, x^4; (b) y, y^5]




3. Show that the residues, mod 5, excluding 0, form a cyclic group of order 4 under
multiplication, and identify its generators. If x is a generator, illustrate Theorems
8.2 and 8.3 by determining the orders of x^2 and x^3. [Generators are 2 and 3]


4. Show that the residues (1, 3, 5, 7), mod 8, form a group under multiplication.
Find the orders of its elements and hence show that it is not isomorphic to C_4
(see Section 8.10). Show that it is isomorphic to the Vierergruppe shown in
Table 8.5 (p. 230) and specify the correspondence.

[3, 5 and 7, each of order 2; Vierergruppe a = 3, b = 5, c = 7]


5. Show that the residues (1, 3, 7, 9), mod 10, form a cyclic group of order 4 under
multiplication. What are the orders and the inverses of its elements? Identify
its generators. Show that it is isomorphic to the group (0, 1, 2, 3), mod 4, under
addition with the one-to-one correspondence (1, 3, 7, 9) → (0, 1, 3, 2)
respectively. State another one-to-one correspondence. Verify that corresponding
elements have the same order.

[Generators 3, 7; 9 (order 2); 3 is inverse of 7 and vice versa, 9 is inverse of 9;
another correspondence (1, 3, 7, 9) → (0, 3, 1, 2)]


6. Show that the group of residues, mod 7, excluding 0, under multiplication
is isomorphic to the group of residues, mod 6, under addition and specify a
one-to-one correspondence. [(1, 3, 2, 6, 4, 5) → (0, 1, 2, 3, 4, 5)]


7. Show that the residues, mod 13, excluding 0, form a group under multiplication
which is isomorphic to C_12. Identify the generators. [Generator is 2]


8. Show that the residues (1, 5, 7, 11), mod 12, form a group isomorphic with the
Vierergruppe (Table 8.5) under multiplication. Identify the subgroups.
[(e, a, b, c) → (1, 5, 7, 11)]


9. A binary operation is defined on the set P of all ordered pairs (g, h), where g
is any element of a group G and h any element of a group H, by setting

    (g_1, h_1) ∘ (g_2, h_2) = (g_1 g_2, h_1 h_2)

for any g_1, g_2 in G and h_1, h_2 in H. Prove that (P, ∘) is a group, i.e. show that
the operation ∘ is associative, and prove that (e_G, e_H) is a unit element where
e_G and e_H are the unit elements of G and H respectively. Find an inverse
element for any element (g, h) of P, with respect to the operation ∘, and prove
that P contains a subgroup isomorphic with G and a subgroup isomorphic
with H.

If G is the cyclic group (1, x, x^2 = 1) of order 2, and H is the cyclic group
(1, y, y^2, y^3 = 1) of order 3, prove that P is a cyclic group. Find the order of P
in this case. [Inverse element (g^{-1}, h^{-1}); generator (x, y), order 6]


10. Show that the cross-ratios

    t,   1/t,   1 − t,   1/(1 − t),   (t − 1)/t,   t/(t − 1)

form a group where the operation is the substitution of the second factor for
t in the first factor, e.g.

    (t − 1)/t ∘ 1/t = (1/t − 1)/(1/t) = 1 − t    (another element)




Show that the group is isomorphic to the symmetry group of degree 3, i.e. to 
the dihedral group of order 6. 


11. The elements 0, 1, 2 form a group under addition (mod 3) and the elements 1,
½(−1 ± i√3) form a group under multiplication. Show that the groups are
isomorphic and specify the correspondence. [(0, 1, 2) → (1, ω, ω^2) where ω =
e^{2πi/3}]

12. Show that, under the one-to-one correspondence (1, i, −1, −i) → (0, 1, 2, 3)
respectively, the group (1, i, −1, −i) under multiplication is isomorphic to
the group of residues (0, 1, 2, 3) mod 4 under addition and also isomorphic
to C_4. State another isomorphic correspondence. [(1, i, −1, −i) → (1, 2, 3, 0)]

13. Complete the table of products for a group consisting of four elements, e, a, 
b, c, If the elements of the group are taken from the set of complex numbers 
and the group operation is the ordinary multiplication of complex numbers, 
find all possible selections of the numbers e, a, b and c. Prove also that, to 
within isomorphism, there are only two groups of order 4. 


x TP eua “be 


e a@coar té 
ajljab 
@ ie 


(Oxford and Cambridge G.C.E. A-level) 
[e, a, b, c) a a, 1, 1; 1), qd, seal ks —1); a, i; <i |e =} Gs 1h Sila) 


14. Each of the following is a group of order 4: 
(a) Residues (1, 5, 7, 11) mod 12 under multiplication. 
(6) The symmetry group of the rectangle. 
(c) (x, —x, 1/x, —1/x) where the operation is the substitution of the second 
element for x in the first element. 
ees tit 
z= 
Zea | 3 


Show that these groups are isomorphic. 
1S. Sanda Vg _f 23 4 
Hen (os 4a B@, pe} 
show that yxy? = z, Find (i) x*y. (ii) yx*. (iii) (xy)*. (iv) yxy3. 
afl 2 3 aN. cheese a sy ok UN 25 AN a DSi 
[o() 3 4 ») Go (; 4 3 i)» i ( 23 i (iv) ( 2 4 3) 


16. Show that the three permutations 
293 
Si Uae 


£23 (5 949) 
le Pa | 




form a cyclic group of order 3 which is a subgroup of the symmetric group of 
degree 3. Verify by direct multiplication of its elements, that the symmetric 
group S; is isomorphic to the dihedral group of order 6. Is the isomorphism 
uniquely defined ? 

17. Prove that the Vierergruppe is isomorphic to the subgroup of S, whose elements 
are 

2 3° 4 a5.) ged os 
ae, 3 4 Sam 2) 2 oi: We AS ata 

18. How many axes of symmetry has an equilateral triangular lamina (i) in its own 
plane and (ii) otherwise ? 

With each of the axes of symmetry can be associated a finite group of rotations 
of the triangle into itself. What is the order of each group? 
(Oxford and Cambridge G.C.E. A-Level) 

19. In the symmetry group of the square, verify that 
(a) (pr) =e (6) pr’pr = r° (c) rpr® = pr (d) r'pr® = pr® 

20. Write down the product table for the symmetry group of the rhombus and 
identify the subgroups. 

21. Find the symmetry group of the regular pentagon ABCDE denoting a rotation 
of 72° by r and a half-turn about the axis of symmetry through A by p. Verify 
that each element is of order 2 and that 

rp =pr* (¢ = 1,2, 3,4) 
22. Obtain the product table for the symmetry group of the regular hexagon 


expressing the elements in terms of r, a rotation through 60°, and p, a reflection 
in an axis of symmetry. Show that 


(a) rpr =p = (b) (pr’*? =e = (c) prp = 
and find the inverses and the orders of all the elements. 


8.12 Subgroups 


Examples of subgroups have already arisen in previous sections. Subgroups 
will now be considered formally in some detail. 


DEFINITION 8.3 
A subset H of a group G is a subgroup of G provided that H is a group under 
the binary operation in G. 


It should be noted that a subset of a group G is not necessarily a subgroup 
of G, e.g. the set of integers, positive and negative, together with zero form a 
group under addition but the subset of positive integers does not form a group 
under addition because it does not contain an identity element. 
The above definition requires that, under the binary operation of G,
(i) H is closed so that for any elements h_1, h_2 ∈ H, h_1 h_2 ∈ H
(ii) any element h ∈ H has an inverse h^{-1} ∈ H.



The associative law is automatically satisfied in H since it is satisfied in the 
group G. 

Also, (i) and (ii) imply that the identity element of G is in H since for any
h ∈ H, h^{-1} ∈ H by (ii) and hence h h^{-1} = e ∈ H by (i).

The following theorem provides a practical criterion for deciding whether 
a subset of a given group is a subgroup. 


THEOREM 8.4 
H is a subgroup of G if and only if h_1 h_2^{-1} ∈ H for all elements h_1, h_2 ∈ H.


The condition is necessary since if H is a subgroup, we have
h_2^{-1} ∈ H for all h_2 ∈ H by (ii)
Therefore h_1 h_2^{-1} ∈ H by (i)


Conversely, assuming that h_1 h_2^{-1} ∈ H for all h_1, h_2 ∈ H, we have
h_1 h_1^{-1} ∈ H for all h_1 ∈ H, therefore e ∈ H


Therefore by hypothesis, for all h_2 ∈ H, e h_2^{-1} ∈ H, i.e. h_2^{-1} ∈ H which
verifies (ii)

Hence h_1 (h_2^{-1})^{-1} ∈ H, i.e. h_1 h_2 ∈ H which verifies (i)

Therefore H is a subgroup of G.
Any group contains at least two subgroups, the whole group itself and the 
trivial group consisting of the identity element e only. Any other subgroup 
is called a proper subgroup. 
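The criterion of Theorem 8.4 lends itself to a one-line test on a finite subset. The sketch below applies it to subsets of the residues (mod 7), excluding 0, under multiplication; the function names are assumptions, and the inverse is computed as a^5 because a^6 = 1 for every element of this particular group.

```python
def is_subgroup(H, op, inv):
    """Theorem 8.4 as a test: H is non-empty and h1 * inv(h2) lies in H for all h1, h2."""
    H = set(H)
    return bool(H) and all(op(h1, inv(h2)) in H for h1 in H for h2 in H)

def times_mod7(a, b):
    return (a * b) % 7

def inverse_mod7(a):
    # In the group of order 6, a^6 = 1, so a^5 is the inverse of a.
    return pow(a, 5, 7)

print(is_subgroup({1, 2, 4}, times_mod7, inverse_mod7))   # cyclic subgroup generated by 2
print(is_subgroup({1, 6}, times_mod7, inverse_mod7))      # subgroup generated by 6
print(is_subgroup({1, 2, 3}, times_mod7, inverse_mod7))   # fails the criterion
```

The third subset fails because, for instance, 1 . 2^{-1} = 4 is not in it, so a single check replaces the separate verification of closure and inverses.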


8.13 The Centre of a Group 


DEFINITION 8.4 
The subset of a group, comprising those elements which commute with all 
elements of the group, is called the centre of that group. 


If a group is Abelian, the centre is the whole group. As an example of a non- 
Abelian group, consider the symmetry group of the equilateral triangle 
(Section 8.11). The powers of r commute with each other but none commutes 
with p. Indeed, only one element e commutes with all elements of this group 
so that its centre is e, the trivial group. In a non-Abelian group, the centre 
may be a proper subgroup as the following theorem shows. 


THEOREM 8.5 
The centre of a group is a subgroup. 




Let $c_1$, $c_2$ be elements in the centre and let g be any element of the group. 
Therefore 


$c_1 g = g c_1$ and $c_2 g = g c_2$ 
$c_1 c_2 g = c_1 g c_2 = g c_1 c_2$, thus $c_1 c_2$ is an element in the centre 


Clearly e is an element in the centre since it commutes with all elements of the 
group. 
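Again, for any g, we have 
$c_1 g = g c_1$, so that $(c_1 g)^{-1} = (g c_1)^{-1}$, i.e. 
$g^{-1} c_1^{-1} = c_1^{-1} g^{-1}$ (see Section 8.5) 


Since $g^{-1}$ ranges over the whole group as g does, $c_1^{-1}$ is an element in the 
centre, and thus the centre is a subgroup of G. 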
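
The definition of the centre translates directly into a search over a finite group. In the sketch below, which is an illustration under stated assumptions rather than part of the text, the symmetry group of the equilateral triangle is modelled by the six permutations of three vertices numbered 0, 1, 2; the names `compose` and `centre` are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of two permutations stored as tuples of images:
    q is applied first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def centre(elements, op):
    """The elements that commute with every element of the group."""
    return [c for c in elements if all(op(c, g) == op(g, c) for g in elements)]

# The six permutations of the vertices 0, 1, 2 (a model of the triangle group).
S3 = list(permutations(range(3)))
print(centre(S3, compose))                    # [(0, 1, 2)], the identity only

# An Abelian example: the integers mod 6 under addition.
Z6 = list(range(6))
print(centre(Z6, lambda a, b: (a + b) % 6))   # [0, 1, 2, 3, 4, 5], the whole group
```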


8.14 Cayley’s Theorem 


This theorem, due to the British mathematician Cayley (1821-1895), is one 
of the most fundamental in group theory. It proves that any finite group is 
isomorphic to a group of permutations. 


THEOREM 8.6 


Any finite group G, of order n, is isomorphic to a subgroup of the symmetric 
group $S_n$ of degree n. 


Let $g_1, g_2, g_3, \ldots, g_n$ denote the n elements of G. The n! permutations of the 
numbers 1, 2, 3, ..., n, by which these elements are labelled, comprise the 
symmetric group $S_n$. 
Let x be a particular element of the group G, then the products $x g_1, x g_2, 
x g_3, \ldots, x g_n$ will be all different since 

$$x g_i = x g_j \Rightarrow g_i = g_j$$

and must therefore be a permutation of the elements $g_1, g_2, g_3, \ldots, g_n$. 
If we now write 

$$x g_1 = g_{i_1}, \quad x g_2 = g_{i_2}, \quad x g_3 = g_{i_3}, \quad \ldots, \quad x g_n = g_{i_n}$$

then $i_1, i_2, i_3, \ldots, i_n$ will be a permutation of 1, 2, 3, ..., n. 
Now consider the correspondence 

$$x \leftrightarrow X = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ i_1 & i_2 & i_3 & \cdots & i_n \end{pmatrix} \qquad (8.1)$$




This correspondence associates one permutation X of the group $S_n$ with 
one particular element x of G. A different element y corresponds to a different 
permutation since $x g_1 = y g_1 \Leftrightarrow x = y$. 

Thus in eqn (8.1), we have a one-to-one correspondence between G and 
a subset $P_n$ of n permutations belonging to the group $S_n$. 


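Let 

$$y \leftrightarrow Y = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix} \in P_n$$

Then 

$$YX = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_{i_1} & j_{i_2} & j_{i_3} & \cdots & j_{i_n} \end{pmatrix}$$

Now we have 

$$x g_1 = g_{i_1}, \quad x g_2 = g_{i_2}, \quad x g_3 = g_{i_3}, \quad \ldots, \quad x g_n = g_{i_n}$$

Therefore 

$$y x g_1 = y g_{i_1} = g_{j_{i_1}}, \quad y x g_2 = y g_{i_2} = g_{j_{i_2}}, \quad \ldots, \quad y x g_n = y g_{i_n} = g_{j_{i_n}}$$

i.e. 

$$yx \leftrightarrow \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_{i_1} & j_{i_2} & j_{i_3} & \cdots & j_{i_n} \end{pmatrix} = YX$$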


The one-to-one correspondence (8.1) is thus preserved under the operation of 
multiplication. 

Thus the correspondence is an isomorphism between G and a subset $P_n$ of 
$S_n$. Since G is a group, $P_n$ is also a group. G is therefore isomorphic to a 
subgroup of $S_n$. 

Consider the cyclic group (e, a, b,c) of order 4 whose product table is 
shown in Table 8.18. 


      | e  a  b  c
   ---+-----------
    e | e  a  b  c
    a | a  b  c  e
    b | b  c  e  a
    c | c  e  a  b

TABLE 8.18 




Let e, a, b, c be labelled 1, 2, 3, 4 respectively. Then 
ae = a will be labelled 2, aa = b as 3, ab = c as 4, ac = e as 1. 


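Therefore, in accordance with eqn (8.1), 

$$a \leftrightarrow \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$$

Similarly 

$$e \leftrightarrow \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}, \quad b \leftrightarrow \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}, \quad c \leftrightarrow \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix}$$

Thus the cyclic group of order 4 is isomorphic to the subgroup $P_4$ comprising 
the above 4 permutations of the symmetric group $S_4$. 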


8.15 Lagrange’s Theorem 


The possible orders of the subgroups of a finite group G are limited by the 
order of G in accordance with the following theorem. 


THEOREM 8.7 
If H is a subgroup of order h of a finite group G of order g, then h is a factor 
of g. 


If H is a trivial group, the result is obvious. 
Let H be a proper subgroup and let $a_1, a_2, a_3, \ldots, a_h$ be its h (< g) distinct 
elements of which one is e. Let $x \in G$ such that $x \notin H$. Consider the elements 

$$x a_1, \quad x a_2, \quad x a_3, \quad \ldots, \quad x a_h \qquad (8.2)$$

The elements (8.2) will all be different since, otherwise, if $x a_i = x a_j$ for some 
i, j with $i \neq j$, then $a_i = a_j$, which is a contradiction since $a_i$, $a_j$ are distinct 
elements of H. 




Also no element of the set (8.2) is in H since, otherwise $x a_i = a_j$ for some 
i, j, i.e. $x = a_j a_i^{-1} \in H$, which is contrary to hypothesis. 

The set of elements (8.2) is denoted by xH and is called the left coset* of G 
relative to H. 

The sets H and xH thus comprise 2h elements of G, h of which are in H 
and h in xH. If G has additional elements, let y be one of these elements such 
that $y \in G$ but $y \notin H$ and $y \notin xH$. 

The left coset yH contains h distinct elements, none of which is in H. Also 
none is in xH since, otherwise, if $y a_i = x a_j$ for some i, j, then $y = x a_j a_i^{-1}$, 
i.e. $y \in xH$, which is contrary to hypothesis. 

We now have 3h distinct elements of G, h of them being in each of H, xH, 
yH. This process of dividing G into left cosets may be continued 
until the finite number of elements in G is exhausted so that eventually G is 
partitioned into r sets each of h elements. Therefore g = rh which proves 
Lagrange’s Theorem. 
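
The partitioning process in the proof can be imitated step by step on a small group. The sketch below is illustrative only: the function name `left_cosets` is hypothetical, and the group chosen, the residues (1, 3, 5, 9, 11, 13) (mod 14) under multiplication with H = (1, 9, 11), is simply a convenient finite example.

```python
def left_cosets(G, H, op):
    """Partition G into left cosets xH as in the proof: take an element not
    yet covered, form its coset, and repeat until G is exhausted."""
    cosets, covered = [], set()
    for x in G:
        if x not in covered:
            coset = frozenset(op(x, a) for a in H)
            cosets.append(coset)
            covered |= coset
    return cosets

# Illustrative group: residues (1, 3, 5, 9, 11, 13) (mod 14) under multiplication.
G = [1, 3, 5, 9, 11, 13]
H = [1, 9, 11]
cosets = left_cosets(G, H, lambda a, b: (a * b) % 14)

print([sorted(c) for c in cosets])       # [[1, 9, 11], [3, 5, 13]]
print(len(G) == len(cosets) * len(H))    # True, i.e. g = rh with r = 2, h = 3
```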

The following theorems may be easily deduced from Lagrange’s Theorem. 


THEOREM 8.8 
If G is a finite group of order p where p is prime, G has no proper subgroups. 

p and 1 are the only factors of p. Therefore by Lagrange’s Theorem, the 
only subgroups of G have orders 1 and p. Therefore G has no proper subgroups. 


THEOREM 8.9 
If x is an element of order r of a finite group G of order n, then r is a factor 
of n and $x^n = e$. 

By Theorem 8.1, $(e, x, x^2, \ldots, x^{r-1})$ form a cyclic group of order r which 
is a subgroup of G. Therefore r is a factor of n. 

Also $x^r = e$ and n = kr. Thus 

$$x^n = x^{kr} = (x^r)^k = e$$
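
Theorem 8.9 is easy to verify numerically. The following sketch, offered as an illustration and not as part of the text, uses the non-zero residues (mod 17) under multiplication, a group of order n = 16; the function name `order` is hypothetical.

```python
def order(x, op, e):
    """Smallest r >= 1 with x^r = e (a finite group is assumed)."""
    r, y = 1, x
    while y != e:
        y = op(y, x)
        r += 1
    return r

MODULUS = 17
G = list(range(1, MODULUS))            # non-zero residues (mod 17)
n = len(G)                             # order of the group, n = 16

def mul(a, b):
    return (a * b) % MODULUS

orders = {x: order(x, mul, 1) for x in G}
print(orders[4], orders[16])                        # 4 2
print(all(n % r == 0 for r in orders.values()))     # True: r is a factor of n
print(all(pow(x, n, MODULUS) == 1 for x in G))      # True: x^n = e
```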


THEOREM 8.10 
The only group of order p, where p is a prime number, is the cyclic group of 
order p. 

By Theorem 8.9 the order of any element $x \neq e$ of the group must be p 
since p is prime. Thus x generates a cyclic group of order p which must be the 
whole group. It follows that the only groups of orders 2, 3, 5, 7 are cyclic 
groups. 


* The set $(a_1 x, a_2 x, a_3 x, \ldots, a_h x)$ is called the right coset of G relative to H and is 
denoted by Hx. 




THEOREM 8.11 
If n is composite, the cyclic group $C_n$ has a unique cyclic subgroup $C_r$ for each 
factor r of n. 

Let x be a generator of $C_n$ so that $x^n = e$. Let n = rm. Consider the 
elements 

$$e, \quad x^m, \quad x^{2m}, \quad \ldots, \quad x^{(r-1)m}$$

These elements are distinct and form a cyclic subgroup of $C_n$ of order r 
generated by $x^m$. Thus a cyclic subgroup of order r exists. 

It is now necessary to show that this subgroup is unique. Let y be any 
element of a subgroup of order r. 

By Theorem 8.9, 

$$y^r = e$$

Let $y = x^k$ where x is a generator of $C_n$. 

Therefore, $y^r = x^{kr} = e$. 

Thus kr is a multiple of n. Let $kr = \mu n = \mu r m$. 

Therefore, $k = \mu m$. Thus $y = x^k = x^{\mu m}$. 

Thus each element of the subgroup of order r is of the form $x^{\mu m}$. Since 
there are only r distinct elements of this kind, they comprise the whole of the 
cyclic group of order r generated by $x^m$. 

Thus the cyclic subgroup of order r is unique. 
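
The statement can be confirmed by exhaustive search for a small value of n. In the sketch below, which is an illustration under stated assumptions, $C_n$ is modelled as the integers mod n under addition, so that the element $x^k$ of the text corresponds to the residue k; the function name `subgroups_of_cyclic` is hypothetical, and the search over all subsets is only practical for small n.

```python
from itertools import combinations

def subgroups_of_cyclic(n):
    """All subgroups of C_n, modelled as the integers mod n under addition.
    Every subset containing 0 is tested with Theorem 8.4 in additive form."""
    found = []
    for size in range(n):
        for rest in combinations(range(1, n), size):
            H = {0, *rest}
            if all((a - b) % n in H for a in H for b in H):
                found.append(sorted(H))
    return found

n = 12
by_order = {}
for H in subgroups_of_cyclic(n):
    by_order.setdefault(len(H), []).append(H)

# One line per subgroup order r, listing every subgroup of that order.
for r in sorted(by_order):
    print(r, by_order[r])
```

For n = 12 the orders found are exactly the factors 1, 2, 3, 4, 6, 12, with one subgroup for each, and the subgroup of order r consists of the multiples of m = n/r.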


EXAMPLE 8.3 
(A) The group of residues (1, 3,5, 7) (mod 8) under multiplication has a 
subgroup H = (1, 5). Write down the cosets H1, H3, H5, H7. 


Solution The group table is given in Table 8.19 


      | 1  3  5  7
   ---+-----------
    1 | 1  3  5  7
    3 | 3  1  7  5
    5 | 5  7  1  3
    7 | 7  5  3  1

TABLE 8.19 


For H = (1, 5), 


coset H1 = (1, 5) (= 1H)    coset H3 = (3, 7) (= 3H) 
coset H5 = (5, 1) (= 5H)    coset H7 = (7, 3) (= 7H) 




In accordance with Lagrange’s Theorem, the group is partitioned into 4/2, 
i.e. 2 sets. 


(B) H is the subgroup $(e, r, r^2, r^3)$ of the octic group whose table is given in 
Table 8.17 (p. 246). Show that all the left and right cosets of the octic group 
relative to H comprise either $(e, r, r^2, r^3)$ or $(p, pr, pr^2, pr^3)$. Find the left and 
right cosets of the group relative to the subgroup K = (e, p). 


Solution H = $(e, r, r^2, r^3)$ 


eH = $(e, r, r^2, r^3)$ = He        pH = $(p, pr, pr^2, pr^3)$ = Hp 
eH = He = rH = Hr = $r^2 H$ = $H r^2$ = $r^3 H$ = $H r^3$ 
pH = Hp = prH = Hpr = $pr^2 H$ = $H pr^2$ = $pr^3 H$ = $H pr^3$ 
There are 8/4, i.e. 2 distinct sets. 
For the cosets relative to K, 


eK = (e, p)                    pK = (p, e) 
rK = $(r, pr^3)$               prK = $(pr, r^3)$ 
$r^2 K$ = $(r^2, pr^2)$        $pr^2 K$ = $(pr^2, r^2)$ 
$r^3 K$ = $(r^3, pr)$          $pr^3 K$ = $(pr^3, r)$ 
Ke = (e, p)                    Kp = (p, e) 
Kr = (r, pr)                   Kpr = (pr, r) 
$K r^2$ = $(r^2, pr^2)$        $K pr^2$ = $(pr^2, r^2)$ 
$K r^3$ = $(r^3, pr^3)$        $K pr^3$ = $(pr^3, r^3)$ 


In accordance with Lagrange’s theorem there are 8/2, i.e. 4 sets in each case. 
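
The cosets relative to K can be generated mechanically. The sketch below is a hedged illustration: Table 8.17 is not reproduced in this section, so the multiplication rule is an assumption, namely that the octic group is generated by r and p with $r^4 = p^2 = e$ and $rp = pr^3$, which is the convention consistent with the cosets listed above. Each element $p^s r^k$ is stored as the pair (s, k); the names `mul` and `name` are hypothetical.

```python
def mul(g1, g2):
    """Product of p^s1 r^k1 and p^s2 r^k2, assuming r^4 = p^2 = e and rp = pr^3."""
    (s1, k1), (s2, k2) = g1, g2
    k = (-k1 if s2 else k1) + k2      # moving p to the left inverts the power of r
    return ((s1 + s2) % 2, k % 4)

def name(g):
    """Readable name of the element p^s r^k."""
    s, k = g
    if g == (0, 0):
        return "e"
    return ("p" if s else "") + ("", "r", "r^2", "r^3")[k]

G = [(s, k) for s in (0, 1) for k in range(4)]   # e, r, r^2, r^3, p, pr, pr^2, pr^3
K = [(0, 0), (1, 0)]                              # the subgroup K = (e, p)

left = {name(g): [name(mul(g, a)) for a in K] for g in G}    # left cosets gK
right = {name(g): [name(mul(a, g)) for a in K] for g in G}   # right cosets Kg

print(left["r"], right["r"])    # ['r', 'pr^3'] ['r', 'pr']

# Number of distinct cosets on each side.
n_left = len({frozenset(c) for c in left.values()})
n_right = len({frozenset(c) for c in right.values()})
print(n_left, n_right)          # 4 4
```

The left and right cosets of r differ, which shows that for this subgroup the two partitions of the group are not the same, although each contains 8/2 = 4 sets.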


EXERCISES 8.3 
1. Find the centres of the dihedral groups of orders 6 and 8. Show that they 
contain Abelian subgroups larger than the centre. 
[e, $(e, r^2)$, e.g. subgroups in the top left-hand corners of Tables 8.14 and 8.17] 


2. Carry out the proof of Cayley’s theorem with specific reference to the group for 
which the multiplication table is 


      | e  a  b
   ---+--------
    e | e  a  b
    a | a  b  e
    b | b  e  a




3. G is the cyclic group $(1, \omega, \omega^2)$ ($\omega = e^{2\pi i/3}$), the operation being ordinary 
multiplication, and H is the group of residue classes (mod 3) under addition. 
Show that G and H are isomorphic. Find a permutation group isomorphic to 
G and H. 


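$$\left[ \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \right]$$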


4. Which of the following subsets are subgroups? 

(a) The rationals in the group of real numbers under addition. 

(b) The integers, positive and negative (excluding zero) in the group of real 
numbers under addition. 

(c) (i) The rationals and (ii) the integers in the group of real numbers (ex- 
cluding zero) under multiplication. 

(d) The integers, positive, negative and zero, which are multiples of an integer 
n in the group of real numbers, under addition. 

[(b) and (c ii) are not subgroups] 

5. Show that (1,5) forms a proper subgroup of the group of residue classes 
(1, 5, 7, 11) (mod 12) under multiplication. Find another proper subgroup. 
[(1, 7) and (1, 11)] 

6. Show that (1, 4, 13, 16) forms a proper subgroup of the group of the non-zero 
residue classes (mod 17) under multiplication. Find a subgroup of the subgroup 
(1, 4, 13, 16). Is there another subgroup of the main group? 

[(1, 16) is a subgroup of (1, 4, 13, 16) and also of the main group] 

7. Identify the subgroups of (a) the Vierergruppe (b) the cyclic group of order 12. 
[(a) (e, a), (e, b), (e, c); (b) subgroups are generated by $x^2$ (order 6), $x^3$ (order 
4), $x^4$ (order 3) and $x^6$ (order 2)] 

8. Show that H = (1,9, 11) is a subgroup of the group of residues (1, 3, 5, 9, 
11, 13) (mod 14) under multiplication. Write down the cosets 3H, 5H and 
13H. [(3, 5, 13) in each case] 


9. Write down the right and left cosets of 
(i) The Vierergruppe relative to the subgroup (e, a). 
(ii) The dihedral group of order 6 relative to the subgroup (e, r, r*). 
(iii) The cyclic group $(e, x, x^2, \ldots, x^{11})$ ($x^{12} = e$) relative to the subgroup 
$(e, x^4, x^8)$. 
(iv) The symmetric group of degree 4 relative to the subgroup comprising the 
four permutations given in Exercises 8.2, No. 17. 


[(i) (e, a), (b, c), (ii) either $(e, r, r^2)$ or $(p, pr, pr^2)$, (iii) $(e, x^4, x^8)$, $(x, x^5, x^9)$, 


POR gs 
$(x^2, x^6, x^{10})$, $(x^3, x^7, x^{11})$, (iv) As an example, the cosets for A = ( ) 
Pm a oe Ss 


PDS ae 2-3: 4 ; Wil Rae | Peer 4 
are AH = ) 
(, 1,2 ;) (, a4 1 ]l 423: 2 Bi 2k A 


Index 


Abelian group, 226 
Abstract group, 230 
Acceleration, radial and transverse 
components, 70 
Activity, 106 
Algebra 
Boolean, 190 
Binary Boolean, 195 
of circuits, 198 
of the logic of statements, 204 
of residue classes, 215 
of sets (laws of), 190 
Algorithm, Euclid’s, 223 
Angular momentum, 109 
principle of, 110 
conservation of, 110 
Area, vector, 36 
Arithmetic 
prime modulo, 218 
non-prime modulo, 219 
Associative 
laws, 192, 193 
binary operations, 224 
Associativity, 226 




Bayes’ theorem, 163 

Binary Boolean algebra, 195 

Binary operations, 224, 225 
associative, 224 


commutative, 224 
closed, 224 
Binomial distribution, 135, 144 
probability generating function of, 145 
Binomial and multinomial theorems, 
143 
Binormal, 76 
equation of, 76 
Bisectors of angles between two unit 
vectors, 18 
Boolean algebras, 190 


Cayley’s theorem, 251 

Central forces, 110 
proportional to distance, 114 

Centre of a group, 250 

Centroids, 7 

Ceva’s Theorem, 13 

Circuits, switching, 198 

Circular helix, 81 

Circulation, 92 

Classes, residue, 211 

Closure, 193, 224, 226 

Combinations, 127 

Combinatorial formulae, 125 
generating functions, 148 

Common perpendicular to two skew 

lines, 50 




Commutative laws, 192, 193 
Complements of sets, 191 
Components of a vector, 14 
Compound probability, 152 
Conditional probability, 154 
Congruences, 211 

addition, subtraction, multiplication, 

211 

division of, 211 

reflexive, symmetric, transitive, 214 
Conjunctions, 

of statements, 205 

of switches, 198 
Conservation of angular momentum, 

principle of, 110 

Contrapositive propositions, 207 
Converse propositions, 207 
Cosets, 254 
Curvature, 75 
Curves in space, 65 
Cyclic groups, 239, 252 


De Morgan’s laws, 192, 193 
Derivatives of scalar and vector 
products, 68 

Diagrams 

Venn, 124 

tree, 168 
Differential geometry, 73 
Dihedral group (order 6), 243 
Direction cosines, 15 
Disjunctions 

of statements, 205 

of switches, 198 
Distributive laws 

for Boolean algebra, 193 

for scalar triple products, 49 

for sets, 192 

for vector products, 40 
Division of residues, 216 
Dynamics of a particle, 106 


Elements 
of differential geometry, 73 
identity, 225, 226 
inverse, 225, 226 
neutral, 215, 226 
Energy, principle of, 108 




Empty set, 191 
Equations (vector) 
lines, 17, 18 
planes, 20, 21, 33, 35, 49, 50 
Equations of tangent, normal and 
binormal, 76 
Equivalence, vector as, 1 
Equivalence relationships 
for congruences, 214 
for statements, 206, 208 
Events, 120 
causal, 164 
independent, 155 
mutually exclusive, 120, 153 
simple, 121 
Expectation or expected value, 132 
of functions of a random variable, 
135 
Euclid’s algorithm, 223 


Fermat’s theorem, 218 
Force 

central, 110 

centre of, 110 
Frenet—Serret formula, 76 


Generating functions 
binomial, 145 
combinatorial, 148 
of the sum of independent random 
variables, 140 
Generator of a group, 239 
Groups, 224 
abstract, 230 
centre of a, 250 
cyclic, 239, 252 
dihedral (of order 6), 243 
finite, 230 
generator of a, 239 
index law for elements of, 231 
infinite (cyclic), 230, 241 
inverse of the product of elements of, 
231 
isomorphic, 231 
nature of, 225 
order of a, 230 
order of an element of a, 237 
permutation, 234 


symmetric, 236 
symmetry, 
rectangle, 241 
equilateral triangle, 242 
square, 246 
transformation, 241 


Helix, circular, 81 
Hypothesis (false, true), 206 


Idempotent laws, 191, 193 
Identity element, 225 
Impulse, 106 
Impulsive force, 107 
Independent trials, 173 
Infinite cyclic groups, 241 
Integral, 
line, 91 
surface, 95 
time, 107 
of vector function of a scalar, 87 
volume, 101 
Inverse 
element, 225 
of a product of elements of a group, 
231 
propositions, 207 
Isomorphism, 196, 204, 205, 208 
of groups, 231, 244, 245, 252, 253 


Joint probability density functions, 137 


Kepler’s laws of planetary motion, 112 
Kinetic energy, 106 


Lagrange’s theorem, 253 

Law of gravitation, 111 

Laws 
associative, 192, 193 
of Boolean algebra, 192, 193 
commutative, 192, 193 
distributive, 192, 193 
de Morgan’s, 192, 193 
identity, 192, 193 
vector algebra, 5 

Line integral, 91 




Linear 
dependence of vectors, 22 
momentum, 106 
transformation of vectors, 181 
Logic of statements, algebra of, 204 


Markov chain processes, 177 
Matrix 
stochastic, 178 
transition, 178 
Mean value, 133 
Menelaus’ Theorem, 12 
Modulo, 211 
arithmetic prime, 218 
arithmetic non-prime, 219 
Moment 
of a force (vector notation), 45 
of momentum, 109 
second, 135 
Momentum, linear, 106 
Motion 
under a central force proportional to 
distance, 114 
Kepler’s laws of planetary, 112 
of a particle under gravity, 108 
Multinomial 
distribution, 175 
theorem, 143 


Neutral element, 215, 225 
Newton’s law of gravitation, 111 
Normal 

component of acceleration, 83 

equation of, 76 

plane, 76 

to a surface, 97 


Orbits, planetary, 111 
Order 
of a group, 230 
of an element of a group, 237 
Ordered pairs, 224 
Osculating plane, 76 
Outcomes and events, 120 


Parallelogram, law of addition, 2 
Periodic time, 113 

Permutations, 126 

Permutation groups, 234 




Plane 
bisecting the angles between two given 
planes, 35 
through three points, 21, 49 
through a point parallel to two 
vectors, 20 
perpendicular distance of a point 
from a, 34 
through a line parallel to another line, 
50 
normal, 76 
rectifying, 76 
Planetary 
orbits, 111 
motion (Kepler’s laws), 112 
Poisson distribution, 136 
Principal normal, 75 
Principle of 
angular momentum, 110 
duality, 192 
energy, 108 
the conservation of angular momen- 
tum, 110 
Probability 
compound, 152 
conditional, 154 
density function, 132 
distribution, 132 
of an event, 124 
generating function, 139, 140, 145 
joint (density function), 137 
of simultaneous occurrence of two 
events, 153 
theory, 120 
transition, 178 
vector, 179 
Products 
of two permutations, 235 
of four vectors, 56 
vector, 38 
Propositions, converse, inverse, contra- 
positive, 207 


Radial and transverse components of 
velocity and acceleration, 70 
Radius of 
curvature, 75 
torsion, 76 
Random variable, 132 




Ratio theorem, 6 
Reciprocal groups of vectors, 58 
Rectangular unit vectors, 13 
Rectifying plane, 76 
Relationship, equivalence, 208, 214 
Residues 

algebra of, 215 

classes, 211 

division of, 216 
Rotation 

vector, 36 

of rigid body, 46 


Sample 
points, 122 
space, 122 
Scalar, 3 
product of four vectors, 56 
product of two vectors, 25 
triple product of three vectors, 47, 49 
and vector fields, 62 
and vector products (derivatives of), 
68 
Second moment, 135 
Sets 
algebra of, 190 
complements of, 191 
empty, 191 
identity, 191 
universal, 191 
Sine law, 43 
Spherical trigonometry, 58 
Standard deviation, 136 
Statements 
algebra of the logic of, 204 
biconditional, 206 
compound, 204 
conditional, 206 
conjunction of, 205 
disjunction of, 205 
logically true, 207 
Stochastic 
matrix, 178, 182 
process, 168 
Subgroups, 229, 244, 249, 250 
Surface 
integrals, 95 
normal to a, 97 
Switching circuits, 198 




Symmetric group, 236 
Symmetry group of 
equilateral triangle, 242 
rectangle, 241 
square, 246 


Tangent, equation of, 76 


Tangential and normal components of 


acceleration, 83 
Tautology, 208 
Theorems 


binomial and multinomial, 143 


Cayley’s, 251 
Ceva’s, 13 
Fermat’s, 218 


Time 
integral, 107 
periodic, 113 
Torsion, radius of, 76 
Transformation 

geometrical, 241 

groups, 241 
Transition 

matrix, 178 

probabilities, 178 
Tree diagram, 168 
Trials, independent, 173 
Trigonometry, spherical, 58 


Variance, 136 

Variable 
independent, 138 
random, 132 

Vector 
addition, 1, 4 
algebra, 3 
angle between two, 28 
area, 35 
base, 16 


components, 14, 16 
derivative of a, 62 
as an equivalence class, 1 
equation of a plane, 33 
fields, 62 
form of the moment of a force, 45 
function of position, 62 
function of a scalar, 87 
linear dependence of, 22 
linear transformation of, 182 
multiplication by a scalar, 5 
position, 5 
probability, 179 
product 
derivative of, 68 
determinantal form, 42 
distributive law, 40 
of two vectors, 38 
of three vectors, 47 
of four vectors, 56 
vector triple, 54 
reciprocal groups of, 58 
rectangular unit, 13 
rotation, 36 
of a rigid body, 46 
scalar product (distributive law), 27 
scalar triple product, 47 
scalar products of 
two vectors, 26 
four vectors, 56 
sum, 2 
velocity, 66 
Venn diagram, 124 
Vierergruppe, 230, 242 
Velocity 
vector, 66 
radial and transverse components of, 
70 
Volume integrals, 101 




Topics in Modern Mathematics 2 




During the last few years the need to provide a more effective programme of 
mathematical education in schools and colleges has been widely recognized. 
Reforms in the curriculum have been proposed by various organizations with the 
aim of promoting an early understanding of the basic structure of mathematics and 
of eliminating outmoded traditional material. It has become desirable and 
practicable to introduce topics, formerly reserved for more advanced courses, at an 
earlier stage; for example, the concepts and language of set theory are now 
recognized as providing an excellent medium for promoting the understanding and 
appreciation of a wide range of mathematical topics. 


The two volumes of this textbook cover the topics of modern mathematics which 
have been given prominence in recent years. 


Volume 2 contains advanced work, including vector integration and differentiation, 
probability theory, Boolean algebras and group theory. These topics are included in 
NC and HND examinations and also in degree courses in Mathematics, Science 
and Engineering at polytechnics and universities. 


T. D. H. Baber, M.Sc., Ph.D., Dip. Ed., FIMA, is Principal of Farnborough Technical 
College and has held positions as Lecturer in Mathematics, University of Sheffield, 
Head of Mathematics Department, College of Technology and Commerce, Cardiff, 
and Professor of Mathematics, University of Malawi. 


Volume 1 is suitable for Vth and VIth formers, particularly for those studying 
modern mathematics syllabuses. OND/C students will also find it useful. 


This book is also available in a cased edition 


Pitman Publishing ISBN: 0 273 31682 6 £2.40 net 


